POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA
CONSTRUCTS THEREFOR 



Field of the Invention

The present invention relates to polyketides and the polyketide synthase (PKS) 
enzymes that produce them. The invention also relates generally to genes encoding PKS 
enzymes and to recombinant host cells containing such genes and in which expression of 
such genes leads to the production of polyketides. The present invention also relates to 
compounds useful as medicaments having immunosuppressive and/or neurotrophic
activity. Thus, the invention relates to the fields of chemistry, molecular biology, and 
agricultural, medical, and veterinary technology. 

Background of the Invention 

Polyketides are a class of compounds synthesized from 2-carbon units through a
series of condensations and subsequent modifications. Polyketides occur in many types
of organisms, including fungi and mycelial bacteria, in particular, the actinomycetes.
Polyketides are biologically active molecules with a wide variety of structures, and the
class encompasses numerous compounds with diverse activities. Tetracycline,
erythromycin, epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin,
spinosyn, and tylosin are examples of polyketides. Given the difficulty in producing
polyketide compounds by traditional chemical methodology, and the typically low 
production of polyketides in wild-type cells, there has been considerable interest in 
finding improved or alternate means to produce polyketide compounds. 

This interest has resulted in the cloning, analysis, and manipulation by
recombinant DNA technology of genes that encode PKS enzymes. The resulting
technology allows one to manipulate a known PKS gene cluster either to produce the
polyketide synthesized by that PKS at higher levels than occur in nature or in hosts that
otherwise do not produce the polyketide. The technology also allows one to produce
molecules that are structurally related to, but distinct from, the polyketides produced
from known PKS gene clusters. See, e.g., PCT publication Nos. WO 93/13663;
95/08548; 96/40968; 97/02358; 98/27203; and 98/49315; United States Patent Nos.
4,874,748; 5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146; 5,830,750; and
5,843,718; and Fu et al., 1994, Biochemistry 33: 9321-9326; McDaniel et al., 1993,
Science 262: 1546-1550; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888,
each of which is incorporated herein by reference.

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which
are complexes of multiple large proteins, are similar to the synthases that catalyze
condensation of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the
biosynthesis of polyketides through repeated, decarboxylative Claisen condensations
between acylthioester building blocks. The building blocks used to form complex
polyketides are typically acylthioesters, such as acetyl, butyryl, propionyl, malonyl,
hydroxymalonyl, methylmalonyl, and ethylmalonyl CoA. Other building blocks include
amino acid-like acylthioesters. PKS enzymes that incorporate such building blocks
include an activity that functions as an amino acid ligase (an AMP ligase) or as a
non-ribosomal peptide synthetase (NRPS). Two major types of PKS enzymes are known;
these differ in their composition and mode of synthesis of the polyketide synthesized.
These two major types of PKS enzymes are commonly referred to as Type I or
"modular" and Type II or "iterative" PKS enzymes.

In the Type I or modular PKS enzyme group, a set of separate catalytic active
sites (each active site is termed a "domain", and a set thereof is termed a "module") exists
for each cycle of carbon chain elongation and modification in the polyketide synthesis
pathway. The typical modular PKS is composed of several large polypeptides, which can
be segregated from amino to carboxy termini into a loading module, multiple extender
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as
6-deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six
extender modules, and TE of DEBS are present on three separate proteins (designated
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters
304: 205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by
reference.

Generally, the loading module is responsible for binding the first building block
used to synthesize the polyketide and transferring it to the first extender module. The
loading module of DEBS consists of an acyltransferase (AT) domain and an acyl carrier
protein (ACP) domain. Another type of loading module utilizes an inactivated
ketosynthase (KS) domain and AT and ACP domains. This inactivated KS is in some
instances called KS^Q, where the superscript letter is the abbreviation for the amino acid,
glutamine, that is present instead of the active site cysteine required for ketosynthase
activity. In other PKS enzymes, including the FK-506 PKS, the loading module
incorporates an unusual starter unit and is composed of a CoA ligase-like activity
domain. In any event, the loading module recognizes a particular acyl-CoA (usually
acetyl or propionyl but sometimes butyryl or other acyl-CoA) and transfers it as a thiol
ester to the ACP of the loading module.

The AT on each of the extender modules recognizes a particular extender-CoA
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and
2-hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester.
Each extender module is responsible for accepting a compound from a prior module,
binding a building block, attaching the building block to the compound from the prior
module, optionally performing one or more additional functions, and transferring the
resulting compound to the next module.

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one,
two, or three domains that modify the beta-carbon of the growing polyketide chain. A
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP
domain. These three domains are sufficient to activate a 2-carbon extender unit and
attach it to the growing polyketide molecule. The next extender module, in turn, is
responsible for attaching the next building block and transferring the growing compound
to the next extender module until synthesis is complete.

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the
loading module is transferred to form a thiol ester (trans-esterification) at the KS of the
first extender module; at this stage, extender module one possesses an acyl-KS and a
malonyl (or substituted malonyl) ACP. The acyl group derived from the loading module
is then covalently attached to the alpha-carbon of the malonyl group to form a
carbon-carbon bond, driven by concomitant decarboxylation, and generating a new acyl-ACP
that has a backbone two carbons longer than the loading building block (elongation or
extension).

The polyketide chain, growing by two carbons at each extender module, is
sequentially passed as covalently bound thiol esters from extender module to extender
module, in an assembly line-like process. The carbon chain produced by this process
alone would possess a ketone at every other carbon atom, producing a polyketone, from
which the name polyketide arises. Most commonly, however, additional enzymatic
activities modify the beta keto group of each two carbon unit just after it has been added
to the growing polyketide chain but before it is transferred to the next module.
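
The chain-length arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not part of the disclosed methods: the function names are hypothetical, side-chain carbons contributed by substituted extender units (for example the methyl group of methylmalonyl) are not counted, and the three-carbon starter used in the example assumes a propionyl starter unit for DEBS.

```python
from typing import List


def backbone_carbons(starter_carbons: int, extender_modules: int) -> int:
    """Backbone length after each extender module has added one 2-carbon unit.

    Hypothetical helper: counts backbone carbons only, not side-chain carbons.
    """
    if starter_carbons < 1 or extender_modules < 0:
        raise ValueError("need at least one starter carbon and no negative module count")
    return starter_carbons + 2 * extender_modules


def unreduced_keto_positions(extender_modules: int) -> List[int]:
    """Positions of the keto groups if no beta-keto processing occurred.

    Carbons are numbered from the thioester carbonyl (C-1); each elongation
    leaves the previous carbonyl at the beta position, so ketones fall on
    every other carbon: C-3, C-5, and so on.
    """
    return [2 * i + 1 for i in range(1, extender_modules + 1)]


# Example: six extender modules (as in DEBS) with an assumed 3-carbon starter.
deb_backbone = backbone_carbons(starter_carbons=3, extender_modules=6)
deb_ketones = unreduced_keto_positions(6)
```

The second function makes the origin of the term "polyketone" concrete: without any beta-keto processing, every odd-numbered carbon from C-3 onward would carry a ketone.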









WO 00/20601 PCT/US99/22886 

Thus, in addition to the minimal module containing KS, AT, and ACP domains
necessary to form the carbon-carbon bond, and as noted above, other domains that
modify the beta-carbonyl moiety can be present. Thus, modules may contain a
ketoreductase (KR) domain that reduces the keto group to an alcohol. Modules may also
contain a KR domain plus a dehydratase (DH) domain that dehydrates the alcohol to a
double bond. Modules may also contain a KR domain, a DH domain, and an
enoylreductase (ER) domain that converts the double bond product to a saturated single
bond, leaving the beta carbon as a methylene function. An extender module can also
contain other enzymatic activities, such as, for example, a methylase or dimethylase
activity.
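
The relationship between the reductive domains in a module and the resulting state of the beta-carbon can be summarized as a small lookup. The following Python sketch is illustrative only; the function name is hypothetical, it assumes every listed domain is catalytically active, and it rejects combinations (such as a DH domain without a KR domain) that the description above does not cover.

```python
def beta_carbon_state(domains):
    """Return the beta-carbon functionality left by a module's reductive domains.

    `domains` is any iterable of domain abbreviations, e.g. ["KS", "AT", "KR", "ACP"].
    Assumes all listed domains are functional (an illustrative simplification).
    """
    present = set(domains)
    if "DH" in present and "KR" not in present:
        raise ValueError("DH without KR is not described; the DH acts on the KR product")
    if "ER" in present and not {"KR", "DH"} <= present:
        raise ValueError("ER without both KR and DH is not described")
    if "ER" in present:
        return "methylene"    # KR + DH + ER: fully saturated
    if "DH" in present:
        return "double bond"  # KR + DH: alcohol dehydrated
    if "KR" in present:
        return "alcohol"      # KR only: keto group reduced
    return "ketone"           # minimal module: keto group untouched


# A minimal module (KS, AT, ACP) leaves the ketone in place.
minimal_state = beta_carbon_state(["KS", "AT", "ACP"])
```

Because the three reductive steps act in sequence on the same carbon, the outcome is determined by the furthest step for which all preceding domains are present.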

After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide.
For example, final synthesis of 6-dEB is regulated by a TE domain located at the end of
extender module six. In the synthesis of 6-dEB, the TE domain catalyzes cyclization of
the macrolide ring by formation of an ester linkage. In FK-506, FK-520, rapamycin, and
similar polyketides, the TE activity is replaced by a RapP (for rapamycin) or RapP-like
activity that makes a linkage incorporating a pipecolic acid residue. The enzymatic
activity that catalyzes this incorporation for the rapamycin enzyme is known as RapP,
encoded by the rapP gene. The polyketide can be modified further by tailoring enzymes;
these enzymes add carbohydrate groups or methyl groups, or make other modifications,
e.g., oxidation or reduction, on the polyketide core molecule. For example, 6-dEB is
hydroxylated at C-6 and C-12 and glycosylated at C-3 and C-5 in the synthesis of
erythromycin A.

In Type I PKS polypeptides, the order of catalytic domains is conserved. When
all beta-keto processing domains are present in a module, the order of domains in that
module from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of
the beta-keto processing domains may be missing in particular modules, but the order of
the domains present in a module remains the same. The order of domains within modules
is believed to be important for proper folding of the PKS polypeptides into an active
complex. Importantly, there is considerable flexibility in PKS enzymes, which allows for
the genetic engineering of novel catalytic complexes. The engineering of these enzymes
is achieved by modifying, adding, or deleting domains, or replacing them with those
taken from other Type I PKS enzymes. It is also achieved by deleting, replacing, or
adding entire modules with those taken from other sources. A genetically engineered
PKS complex should of course have the ability to catalyze the synthesis of the product
predicted from the genetic alterations made.
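
When designing an engineered module, the conserved domain order stated above can be checked mechanically. The Python sketch below is a hypothetical design aid, not a method of the invention: the names are illustrative, and passing the check says nothing about whether the resulting polypeptide would fold or be active.

```python
# N-to-C-terminal order of domains when all are present, as stated in the text.
CANONICAL_ORDER = ("KS", "AT", "DH", "ER", "KR", "ACP")
# Domains of the minimal extender module.
MINIMAL_DOMAINS = {"KS", "AT", "ACP"}


def follows_conserved_order(domains):
    """True if the listed domains appear in the conserved N-to-C order.

    Missing beta-keto processing domains are allowed; unknown abbreviations
    raise ValueError (from tuple.index).
    """
    positions = [CANONICAL_ORDER.index(d) for d in domains]
    return all(a < b for a, b in zip(positions, positions[1:]))


def is_plausible_extender_module(domains):
    """True if the module has KS, AT, and ACP and keeps the conserved order."""
    return MINIMAL_DOMAINS.issubset(domains) and follows_conserved_order(domains)


# A module lacking DH and ER still satisfies the order rule.
ok = is_plausible_extender_module(["KS", "AT", "KR", "ACP"])
# Placing KR ahead of DH violates it.
bad = is_plausible_extender_module(["KS", "AT", "KR", "DH", "ACP"])
```

Treating the rule as a subsequence test mirrors the statement that domains may be absent but that those present keep the same relative order.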

Alignments of the many available amino acid sequences for Type I PKS enzymes
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N-
and C-termini of individual polypeptides. The sequences of these linker regions are less
well conserved than are those for the catalytic domains, which is in part how linker
regions are identified. Linker regions can be important for proper association between
domains and between the individual polypeptides that comprise the PKS complex. One
can thus view the linkers and domains together as creating a scaffold on which the
domains and modules are positioned in the correct orientation to be active. This
organization and positioning, if retained, permits PKS domains of different or identical
substrate specificities to be substituted (usually at the DNA level) between PKS enzymes
by various available methodologies. In selecting the boundaries of, for example, an AT
replacement, one can thus make the replacement so as to retain the linkers of the
recipient PKS or to replace them with the linkers of the donor PKS AT domain, or,
preferably, make both constructs to ensure that the correct linker regions between the KS
and AT domains have been included in at least one of the engineered enzymes. Thus,
there is considerable flexibility in the design of new PKS enzymes with the result that
known polyketides can be produced more effectively, and novel polyketides useful as
pharmaceuticals or for other purposes can be made.

By appropriate application of recombinant DNA technology, a wide variety of
polyketides can be prepared in a variety of different host cells provided one has access to
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes.
The present invention helps meet the need for such nucleic acid compounds by providing
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520
modification enzymes. Moreover, while the FK-506 and FK-520 polyketides have many
useful activities, there remains a need for compounds with similar useful activities but
with better pharmacokinetic profile and metabolism and fewer side-effects. The present
invention helps meet the need for such compounds as well.

Summary of the Invention 
In one embodiment, the present invention provides recombinant DNA vectors
that encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention
include cosmid pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C3,
pKOS065-M27, and pKOS065-M21. The invention also provides nucleic acid
compounds that encode the various domains of the FK-520 PKS, i.e., the KS, AT, ACP,
KR, DH, and ER domains. These compounds can be readily used, alone or in
combination with nucleic acids encoding other FK-520 or non-FK-520 PKS domains, as
intermediates in the construction of recombinant vectors that encode all or part of PKS
enzymes that make novel polyketides.

The invention also provides isolated nucleic acids that encode all or part of one or
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an
acyl transferase activity, and an acyl carrier protein activity. The invention provides an
isolated nucleic acid that encodes one or more open reading frames of FK-520 PKS
genes, said open reading frames comprising coding sequences for a CoA ligase activity,
an NRPS activity, or two or more extender modules. The invention also provides
recombinant expression vectors containing these nucleic acids.

In another embodiment, the invention provides isolated nucleic acids that encode
all or a part of a PKS that contains at least one module in which at least one of the
domains in the module is a domain from a non-FK-520 PKS and at least one domain is
from the FK-520 PKS. The non-FK-520 PKS domain or module originates from the
rapamycin PKS, the FK-506 PKS, DEBS, or another PKS. The invention also provides
recombinant expression vectors containing these nucleic acids.

In another embodiment, the invention provides a method of preparing a
polyketide, said method comprising transforming a host cell with a recombinant DNA
vector that encodes at least one module of a PKS, said module comprising at least one
FK-520 PKS domain, and culturing said host cell under conditions such that said PKS is
produced and catalyzes synthesis of said polyketide. In one aspect, the method is
practiced with a Streptomyces host cell. In another aspect, the polyketide produced is
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-506 or rapamycin.

In another embodiment, the invention provides a set of genes in recombinant
form sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These
genes and the methods of the invention enable one to create recombinant host cells with
the ability to produce polyketides or other compounds that require ethylmalonyl CoA for
biosynthesis. The invention also provides recombinant nucleic acids that encode AT
domains specific for ethylmalonyl CoA. Thus, the compounds of the invention can be
used to produce polyketides requiring ethylmalonyl CoA in host cells that otherwise are
unable to produce such polyketides.

In another embodiment, the invention provides a set of genes in recombinant
form sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA
in a heterologous host cell. These genes and the methods of the invention enable one to
create recombinant host cells with the ability to produce polyketides or other compounds
that require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides
recombinant nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA
and 2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to
produce polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in
host cells that are otherwise unable to produce such polyketides.

In another embodiment, the invention provides a compound related in structure to
FK-520 or FK-506 that is useful in the treatment of a medical condition. These
compounds include compounds in which the C-13 methoxy group is replaced by a
moiety selected from the group consisting of hydrogen, methyl, and ethyl moieties. Such
compounds are less susceptible to the main in vivo pathway of degradation for FK-520
and FK-506 and related compounds and thus exhibit an improved pharmacokinetic
profile. The compounds of the invention also include compounds in which the C-15
methoxy group is replaced by a moiety selected from the group consisting of hydrogen,
methyl, and ethyl moieties. The compounds of the invention also include the above
compounds further modified by chemical methodology to produce derivatives such as,
but not limited to, the C-18 hydroxyl derivatives, which have potent neurotrophic but not
immunosuppressive activity.

Thus, the invention provides polyketides having the structure:

[Structure diagram not reproduced]

wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and
18-hydroxy-FK-506. The invention provides these compounds in purified form and in
pharmaceutical compositions.

In another embodiment, the invention provides a method for treating a medical
condition by administering a pharmaceutically efficacious dose of a compound of the
invention. The compounds of the invention may be administered to achieve
immunosuppression or to stimulate nerve growth and regeneration.

These and other embodiments and aspects of the invention will be more fully
understood after consideration of the attached Drawings and their brief description
below, together with the detailed description, examples, and claims that follow.

Brief Description of the Drawings 
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line
provides a scale in kilobase pairs (kb). The second line shows a restriction map with
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is
SacI; P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and
related genes. Genes are abbreviated with a one letter designation, e.g., C is fkbC.
Immediately under the third line are numbered segments showing where the loading
module (L) and ten different extender modules (numbered 1 - 10) are encoded on the
various genes shown. At the bottom of the Figure, the DNA inserts of various cosmids of
the invention (e.g., 34-124 is cosmid pKOS034-124) are shown in alignment with the
FK-520 biosynthetic gene cluster.

Figure 2 shows the loading module (load), the ten extender modules, and the
peptide synthetase domain of the FK-520 PKS, together with, on the top line, the genes
that encode the various domains and modules. Also shown are the various intermediates
in FK-520 biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and
31 numbered. The various domains of each module and subdomains of the loading
module are also shown. The darkened circles showing the DH domains in modules 2, 3,
and 4 indicate that the dehydratase domain is not functional as a dehydratase; this
domain may affect the stereochemistry at the corresponding position in the polyketide.
The substituents on the FK-520 structure that result from the action of non-PKS enzymes
are also indicated by arrows, together with the types of enzymes or the genes that code
for the enzymes that mediate the action. Although the methyltransferase is shown acting
at the C-13 and C-15 hydroxyl groups after release of the polyketide from the PKS, the
methyltransferase may act on the 2-hydroxymalonyl substrate prior to or
contemporaneously with its incorporation during polyketide synthesis.

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520
(Figure 2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an
ethylmalonyl specific AT domain in extender module 4 of the PKS. At least four of the
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA
pools during FK-520 production. Polyhydroxybutyrate accumulates during vegetative
growth and disappears during stationary phase in other Streptomyces (Ranade and
Vining, 1993, Can. J. Microbiol. 39: 377). Open reading frames with unknown function
are indicated with a question mark.

Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA
from acetoacetyl CoA consistent with the function assigned to four of the genes in the
FK-520 gene cluster shown in Figure 3.

Figure 5 shows a close-up view of the right end of the FK-520 PKS gene cluster
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM
(a methyl transferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to
be a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of
ethylmalonyl CoA).

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506)
metabolism.

Figure 7 shows a schematic process for the construction of recombinant PKS
genes of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506
and FK-520 polyketides of the invention, as described in Example 4, below.

Figure 8, in Parts A and B, shows certain compounds of the invention preferred
for dermal application in Part A and a synthetic route for making those compounds in
Part B.

Detailed Description of the Invention 
Given the valuable pharmaceutical properties of polyketides, there is a need for
methods and reagents for producing large quantities of polyketides, as well as for
producing related compounds not found in nature. The present invention provides such
methods and reagents, with particular application to methods and reagents for producing
the polyketides known as FK-520, also known as ascomycin or L-683,590 (see Holt et
al., 1993, JACS 115: 9925), and FK-506, also known as tacrolimus. Tacrolimus is a
macrolide immunosuppressant used to prevent or treat rejection of transplanted heart,
kidney, liver, lung, pancreas, and small bowel allografts. The drug is also useful for the
prevention and treatment of graft-versus-host disease in patients receiving bone marrow
transplants, and for the treatment of severe, refractory uveitis. There have been additional
reports of the unapproved use of tacrolimus for other conditions, including alopecia
universalis, autoimmune chronic active hepatitis, inflammatory bowel disease, multiple
sclerosis, primary biliary cirrhosis, and scleroderma. The invention provides methods
and reagents for making novel polyketides related in structure to FK-520 and FK-506,
and structurally related polyketides such as rapamycin.

The FK-506 and rapamycin polyketides are potent immunosuppressants, with
chemical structures shown below.

[Structures of FK-506 and rapamycin not reproduced]

FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having 
instead an ethyl group at that position, and has similar activity to FK-506, albeit reduced 
immunosuppressive activity. 

These compounds act through initial formation of an intermediate complex with
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including
FKBP-12. Immunophilins are a class of cytosolic proteins that form complexes with molecules
such as FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular
targets involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to
FKBP occurs through the structurally similar segments of the polyketide molecules,
known as the "FKBP-binding domain" (as generally but not precisely indicated by the
stippled regions in the structures above). The FK-506-FKBP complex then binds
calcineurin, while the rapamycin-FKBP complex binds to a protein known as RAFT-1.
Binding of the FKBP-polyketide complex to these second proteins occurs through the
dissimilar regions of the drugs known as the "effector" domains.




[Diagram not reproduced; recoverable label: Immunosuppression]




The three-component FKBP-polyketide-effector complex is required for signal
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin
that destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of
immunosuppressive activity, even though FKBP binding is unaffected. Further, such
analogs antagonize the immunosuppressive effects of the parent polyketides, because
they compete for FKBP. Such non-immunosuppressive analogs also show reduced
toxicity (see Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760),
indicating that much of the toxicity of these drugs is not linked to FKBP binding.

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have
neurotrophic activity. In the central nervous system and in peripheral nerves,
immunophilins are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP
is markedly enriched in the central nervous system and in peripheral nerves. Molecules
that bind to the neuroimmunophilin FKBP, such as FK-506 and FK-520, have the
remarkable effect of stimulating nerve growth. In vitro, they act as neurotrophins, i.e.,
they promote neurite outgrowth in NGF-treated PC12 cells and in sensory neuronal
cultures, and in intact animals, they promote regrowth of damaged facial and sciatic
nerves, and repair lesioned serotonin and dopamine neurons in the brain. See Gold et al.,
Jun. 1999, J. Pharm. Exp. Ther. 289(3): 1202-1210; Lyons et al., 1994, Proc. National
Academy of Science 91: 3191-3195; Gold et al., 1995, Journal of Neuroscience 15:
7509-7516; and Steiner et al., 1997, Proc. National Academy of Science 94: 2019-2024.
Further, the restored central and peripheral neurons appear to be functional.

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the
small-molecule neurotrophins such as FK-506, FK-520, and rapamycin have different, and
often advantageous, properties. First, whereas protein neurotrophins are difficult to
deliver to their intended site of action and may require intra-cranial injection, the
small-molecule neurotrophins display excellent bioavailability; they are active when
administered subcutaneously and orally. Second, whereas protein neurotrophins show
quite specific effects, the small-molecule neurotrophins show rather broad effects.
Finally, whereas protein neurotrophins often show effects on normal sensory nerves, the
small-molecule neurotrophins do not induce aberrant sprouting of normal neuronal
processes and seem to affect damaged nerves specifically. Neuroimmunophilin ligands
have potential therapeutic utility in a variety of disorders involving nerve degeneration
(e.g., multiple sclerosis, Parkinson's disease, Alzheimer's disease, stroke, traumatic
spinal cord and brain injury, peripheral neuropathies).

Recent studies have shown that the immunosuppressive and neurite outgrowth
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative
activity in the absence of immunosuppressive activity is retained by agents which bind to
FKBP but not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997,
Nature Medicine 3: 421-428.
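
The qualitative structure-activity logic of the preceding paragraphs can be written out as a small truth table. The Python sketch below is only a summary of the statements above under simplifying assumptions: binding is treated as all-or-nothing, the function and key names are hypothetical, and no quantitative potency is implied.

```python
def predicted_activities(binds_fkbp: bool, binds_effector: bool) -> dict:
    """Qualitative activity profile of an FKBP ligand.

    `binds_effector` means that the FKBP-ligand complex can bind its effector
    protein (calcineurin or RAFT); without FKBP binding no such complex forms.
    """
    complex_binds_effector = binds_fkbp and binds_effector
    return {
        # The three-component complex is required for immunosuppression.
        "immunosuppressive": complex_binds_effector,
        # Neurotrophic activity tracks FKBP binding alone.
        "neurotrophic": binds_fkbp,
        # FKBP-only binders compete with the parent drugs for FKBP.
        "antagonizes_parent_immunosuppression": binds_fkbp and not binds_effector,
    }


# An analog that binds FKBP but whose effector domain has been disrupted.
analog_profile = predicted_activities(binds_fkbp=True, binds_effector=False)
```

An analog of this kind is predicted to retain neurotrophic activity while lacking immunosuppressive activity, which is the profile sought for the neuroimmunophilin ligands discussed here.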




[Diagram not reproduced; recoverable label: Nerve regeneration]



Available structure-activity data show that the important features for neurotrophic
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments
of the macrolide ring that bind to FKBP. This portion of the molecule is termed the
"FKBP binding domain" (see VanDuyne et al., 1993, Journal of Molecular Biology 229:
105-124). Nevertheless, the effector domains of the parent macrolides contribute to
conformational rigidity of the binding domain and thus indirectly contribute to FKBP
binding.

[Structure diagram not reproduced; recoverable label: "FKBP binding domain"]

There are a number of other reported analogs of FK-506, FK-520, and rapamycin that
bind to FKBP but not the effector protein calcineurin or RAFT. These analogs show
effects on nerve regeneration without immunosuppressive effects.

Naturally occurring FK-520 and FK-506 analogs include the antascomycins,
which are FK-506-like macrolides that lack the functional groups of FK-506 that bind to
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These
molecules bind FKBP as effectively as does FK-506; they antagonize the effects of both
FK-506 and rapamycin, yet lack immunosuppressive activity.




Other analogs can be produced by chemically modifying FK-506, FK-520, or
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the
effector binding region of FK-506, FK-520, or rapamycin by chemical modification.
While the chemical modifications permitted on the parent compounds are quite limited,
some useful chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 =
0.7 nM for FKBP binding; see Dumont et al., 1992), and the rapamycin analog
WAY-124,466 (IC50 = 12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical Research
Communications 192: 1340-1346) are about as effective as FK-506, FK-520, and
rapamycin at promoting neurite outgrowth in sensory neurons (see Steiner et al., 1997).

[Structures of L-685,818 and WAY-124,466 not reproduced]



One of the few positions of rapamycin that is readily amenable to chemical
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of
rapamycin with a variety of bulky groups has produced analogs showing selective loss of
immunosuppressive activity while retaining FKBP binding (see Luengo et al., 1995,
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in
binding to FKBP.




[Structure diagram: compound 1]



There are also synthetic analogs of FKBP binding domains. These compounds
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally
designed" molecules that retain the FKBP-binding region in an appropriate conformation
for binding to FKBP, but do not possess the effector binding regions. In one example, the
ends of the FKBP binding domain were tethered by hydrocarbon chains (see Holt et al.,
1993, Journal of the American Chemical Society 115: 9925-9938); the best analog, 2,
below, binds to FKBP about as well as FK-506. In a similar approach, the ends of the
FKBP binding domain were tethered by a tripeptide to give analog 3, below, which binds
to FKBP about 20-fold more weakly than FK-506. These compounds are anticipated to have
neuroimmunophilin binding activity.




[Structure diagrams: compounds 2 and 3]



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is
a neurotoxin that, when administered to animals, selectively damages nigral-striatal
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease.
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand
restored the ability of animals to feed themselves and gave improvements in measures of
locomotor activity, neurological outcome, and fine motor control. There were also
corresponding increases in regrowth of damaged nerve terminals. These results
demonstrate the utility of FKBP ligands for treatment of diseases of the CNS.

From the above description, two general approaches towards the design of
non-immunosuppressant, neuroimmunophilin ligands can be seen. The first involves the
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain
is fixed in a conformation optimal for binding to FKBP. The advantages of this approach
are that the conformation of the analogs can be accurately modeled and predicted by
computational methods, and the analogs closely resemble parent molecules that have
proven pharmacological properties. A disadvantage is that the difficult chemistry limits
the numbers and types of compounds that can be prepared. The second approach
involves the trial and error construction of acyclic analogs of the FKBP binding domain
by conventional medicinal chemistry. The advantage of this approach is that the
chemistry is suitable for production of the numerous compounds needed for such
interactive chemistry-bioassay approaches. The disadvantages are that the molecular
types of compounds that have emerged have no known history of appropriate
pharmacological properties, have rather labile ester functional groups, and are too
conformationally mobile to allow accurate prediction of conformational properties.

The present invention provides useful methods and reagents related to the first
approach, but with significant advantages. The invention provides recombinant PKS
genes that produce a wide variety of polyketides that cannot otherwise be readily
synthesized by chemical methodology alone. Moreover, the present invention provides
polyketides that have either or both of the desired immunosuppressive and neurotrophic
activities, some of which are produced only by fermentation and others of which are
produced by fermentation and chemical modification. Thus, in one aspect, the invention
provides compounds that optimally bind to FKBP but do not bind to the effector
proteins. The methods and reagents of the invention can be used to prepare numerous
constrained cyclic analogs of FK-520 in which the FKBP binding domain is fixed in a
conformation optimal for binding to FKBP. Such compounds will show
neuroimmunophilin binding (neurotrophic) but not immunosuppressive effects. The
invention also allows direct manipulation of FK-520 and related chemical structures via
genetic engineering of the enzymes involved in the biosynthesis of FK-520 (as well as
related compounds, such as FK-506 and rapamycin); comparable modifications by
chemical means are simply not possible because of the complexity of the structures. The
invention can also be used to introduce "chemical handles" into normally inert positions
that permit subsequent chemical modifications.

Several general approaches to achieve the development of novel
neuroimmunophilin ligands are facilitated by the methods and reagents of the present
invention. One approach is to make "point mutations" of the functional groups of the
parent FK-520 structure that bind to the effector molecules to eliminate their binding
potential. These types of structural modifications are difficult to perform by chemical
modification, but can be readily accomplished with the methods and reagents of the
invention.

A second, more extensive approach facilitated by the present invention is to
utilize molecular modeling to predict optimal structures ab initio that bind to FKBP but
not effector molecules. Using the available X-ray crystal structure of FK-520 (or
FK-506) bound to FKBP, molecular modeling can be used to predict polyketides that should
optimally bind to FKBP but not calcineurin. Various macrolide structures can be
generated by linking the ends of the FKBP-binding domain with "all possible"
polyketide chains of variable length and substitution patterns that can be prepared by
genetic manipulation of the FK-520 or FK-506 PKS gene cluster in accordance with the
methods of the invention. The ground state conformations of the virtual library can be
determined, and compounds that possess binding domains most likely to bind well to
FKBP can be prepared and tested.
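
The enumeration of "all possible" linker chains described above can be pictured as a combinatorial product over per-module choices. The following Python sketch is illustrative only: the extender-unit names, the reduction states, and the range of chain lengths are hypothetical values chosen for the example and are not specified by the invention. It shows only how quickly such a virtual library grows; the conformational modeling step is not represented.

```python
from itertools import product

# Hypothetical building blocks, chosen only to illustrate the combinatorics.
EXTENDER_UNITS = ("malonyl", "methylmalonyl", "ethylmalonyl", "methoxymalonyl")
REDUCTION_STATES = ("keto", "hydroxy", "ene", "methylene")


def enumerate_chains(min_modules, max_modules):
    """Yield every linker chain as a tuple of (extender, reduction) modules."""
    module_choices = list(product(EXTENDER_UNITS, REDUCTION_STATES))
    for n_modules in range(min_modules, max_modules + 1):
        for chain in product(module_choices, repeat=n_modules):
            yield chain


def library_size(min_modules, max_modules):
    """Closed-form count of chains, used to cross-check the enumeration."""
    per_module = len(EXTENDER_UNITS) * len(REDUCTION_STATES)
    return sum(per_module ** n for n in range(min_modules, max_modules + 1))


if __name__ == "__main__":
    chains = list(enumerate_chains(1, 2))
    print(len(chains), library_size(1, 2))
```

Even with these small, assumed alphabets the library grows geometrically with chain length, which is why a computational filter would be applied before any compound is prepared.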
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Once a compound is identified in accordance with the above approaches, the 
invention can be used to generate a focused library of analogs around the lead candidate, 
to "fine tune" the compound for optimal properties. Finally, the genetic engineering 
methods of the invention can be directed towards producing "chemical handles" that 
enable medicinal chemists to modify positions of the molecule previously inert to
chemical modification. This opens the path to previously prohibited chemical 
optimization of lead compounds by time-proven approaches. 

Moreover, the present invention provides polyketide compounds that have
significant advantages over FK-506 and FK-520 and their analogs, together with the
recombinant genes for the PKS enzymes that produce those compounds. The metabolism and
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to
be similar in these respects. Absorption of tacrolimus is rapid, variable, and incomplete
from the gastrointestinal tract (Harrison's Principles of Internal Medicine, 14th edition,
1998, McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form
is 27% (range 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65
L per kg of body weight (L/kg), and is much higher than the VolD based on whole blood
concentrations, the difference reflecting the binding of tacrolimus to red blood cells.
Whole blood concentrations may be 12 to 67 times the plasma concentrations. Protein
binding is high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The
half-life for distribution is 0.9 hour; elimination is biphasic and variable, with a
terminal half-life of 11.3 hours (range, 3.5 to 40.5 hours). The time to peak
concentration is 0.5 to 4 hours after oral administration.
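
The half-life and bioavailability figures above can be related to drug exposure with simple first-order arithmetic. The Python sketch below is a deliberate simplification: it assumes a single first-order elimination phase governed by the terminal half-life and ignores the distribution phase, so it is not a pharmacokinetic model of tacrolimus. The function names and the example dose are hypothetical.

```python
import math

TERMINAL_HALF_LIFE_H = 11.3   # mean terminal half-life quoted in the text (hours)
ORAL_BIOAVAILABILITY = 0.27   # mean oral bioavailability quoted in the text


def elimination_rate_constant(half_life_h):
    """First-order rate constant k = ln(2) / t_half, in 1/hour."""
    return math.log(2) / half_life_h


def fraction_remaining(hours, half_life_h=TERMINAL_HALF_LIFE_H):
    """Fraction of drug remaining after `hours`, assuming one first-order phase."""
    return math.exp(-elimination_rate_constant(half_life_h) * hours)


def absorbed_dose_mg(oral_dose_mg, bioavailability=ORAL_BIOAVAILABILITY):
    """Portion of an oral dose expected to reach the circulation."""
    return oral_dose_mg * bioavailability


if __name__ == "__main__":
    # Hypothetical 5 mg oral dose, used only to show the arithmetic.
    print(absorbed_dose_mg(5.0))
    print(fraction_remaining(24.0))
```

Under this assumption, an analog with a longer half-life gives a larger remaining fraction at any fixed time, which is the quantitative sense in which slower metabolism could permit a lower dose or less frequent dosing.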

Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the
liver and small intestine. The drug is extensively metabolized, with less than 1% excreted
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus,
doses have to be reduced substantially in primary graft non-function, especially in
children. In addition, drugs that induce the cytochrome P450 3A enzymes reduce
tacrolimus levels, while drugs that inhibit these P450s increase tacrolimus levels.
Tacrolimus bioavailability doubles with co-administration of ketoconazole, a drug that
inhibits P450 3A. See Vincent et al., 1992, In vitro metabolism of FK-506 in rat, rabbit,
and human liver microsomes: Identification of a major metabolite and of cytochrome
P450 3A as the major enzymes responsible for its metabolism, Arch. Biochem. Biophys.
294: 454-460; Iwasaki et al., 1993, Isolation, identification, and biological activities of
oxidative metabolites of FK-506, a potent immunosuppressive macrolide lactone, Drug
Metabolism & Disposition 21: 971-977; Shiraga et al., 1994, Metabolism of FK-506, a
potent immunosuppressive agent, by cytochrome P450 3A enzymes in rat, dog, and
human liver microsomes, Biochem. Pharmacol. 47: 727-735; and Iwasaki et al., 1995,
Further metabolism of FK-506 (Tacrolimus); Identification and biological activities of
the metabolites oxidized at multiple sites of FK-506, Drug Metabolism & Disposition 23:
28-34. The cytochrome P450 3A subfamily of isozymes has been implicated as
important in this degradative process.

Structures of the eight isolated metabolites formed by liver microsomes are
shown in Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on
carbons 13, 15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy)
compounds undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII,
and the 12-hydroxy metabolite at C-10 to give I. Another four metabolites, formed by
oxidation of the four metabolites mentioned above, were isolated using liver microsomes
from dexamethasone-treated rats. Three of these are metabolites doubly demethylated at
the methoxy groups on carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15
(M-VII). The fourth, M-VIII, was the metabolite produced after demethylation of the
31-methoxy group, followed by formation of a fused ring system by further oxidation.
Among the eight metabolites, M-II has immunosuppressive activity comparable to that
of FK-506, whereas the other metabolites exhibit weak or negligible activities.
Importantly, the major metabolite of human, dog, and rat liver microsomes is the
13-demethylated and cyclized FK-506 (M-I).

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed
by cyclization to the inactive M-I, which represents about 90% of the metabolic products
after a 10 minute incubation with liver microsomes. Analogs of tacrolimus that do not
possess a C-13 methoxy group would not be susceptible to the first and most important
biotransformation in the destructive metabolism of tacrolimus (i.e., cyclization of
13-hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer
half-life in the body than does FK-506. The C-13 methoxy group is believed not to be
required for binding to FKBP or calcineurin. The C-13 methoxy is not present at the
corresponding position of rapamycin, which binds to FKBP with an affinity comparable to
that of tacrolimus. Also, analysis of the 3-dimensional structure of the
FKBP-tacrolimus-calcineurin complex shows that the C-13 methoxy has no interaction with
FKBP and only a minor interaction with calcineurin. The present invention provides
C-13-desmethoxy analogs of FK-506 and FK-520, as well as the recombinant genes that
encode the PKS enzymes that catalyze their synthesis and host cells that produce the
compounds.
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These compounds exhibit, relative to their naturally occurring counterparts, 
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or
reduced frequency of administration. Dosing is more predictable, because the variability
in FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood
can vary widely depending on interactions with drugs that induce or inhibit cytochrome
P450 3A (summarized in USP Drug Information for the Health Care Professional). Of
particular importance are the numerous drugs that inhibit or compete for CYP 3A,
because they increase FK-506 blood levels and lead to toxicity (Prograf package insert,
FujisawaGUS, Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A
(e.g., dexamethasone), because they decrease FK-506 blood levels and reduce efficacy.
Because the major site of CYP 3A action on FK-506 is removed in the analogs provided
by the present invention, those analogs are not as susceptible to drug interactions as the
naturally occurring compounds.

Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant
adverse effects resulting from the use of FK-506 and are believed to be similar for
FK-520. Because these effects appear to occur primarily by the same mechanism as the
immunosuppressive action (i.e., FKBP-calcineurin interaction), the intrinsic toxicity of
the desmethoxy analogs may be similar to that of FK-506. However, toxicity of FK-506 is
dose related and correlates with high blood levels of the drug (Prograf package insert,
FujisawaGUS, Rev 4/97, Rec 6/97). Because the levels of the compounds provided by
the present invention should be more controllable, the incidence of toxicity should be
significantly decreased with the 13-desmethoxy analogs. Some reports show that certain
FK-506 metabolites are more toxic than FK-506 itself, and this provides an additional
reason to expect that a CYP 3A resistant analog can have lower toxicity and a higher
therapeutic index.

Thus, the present invention provides novel compounds related in structure to
FK-506 and FK-520 but with improved properties. The invention also provides methods for
making these compounds by fermentation of recombinant host cells, as well as the
recombinant host cells, the recombinant vectors in those host cells, and the recombinant
proteins encoded by those vectors. The present invention also provides other valuable
materials useful in the construction of these recombinant vectors that have many other
important applications as well. In particular, the present invention provides the FK-520
PKS genes, as well as certain genes involved in the biosynthesis of FK-520 in
recombinant form.
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FK-520 is produced at relatively low levels in the naturally occurring cells, 
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus, 
another benefit provided by the recombinant FK-520 PKS and related genes of the 
present invention is the ability to produce FK-520 in greater quantities in the 
recombinant host cells provided by the invention. The invention also provides methods
for making novel FK-520 analogs, in addition to the desmethoxy analogs described 
above, and derivatives in recombinant host cells of any origin. 

The biosynthesis of FK-520 involves the action of several enzymes. The FK-520
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products,
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9
mediated by the P450 hydroxylase that is the fkbD gene product; the hydroxylated
position is then oxidized by the fkbO gene product to form a keto group at C-9. There is
also a methylation at C-31 that is mediated by an O-methyltransferase that is the fkbM
gene product. There are also methylations at the C-13 and C-15 positions by a
methyltransferase believed to be encoded by the fkbG gene; this methyltransferase may
act on the hydroxymalonyl CoA substrates prior to binding of the substrate to the AT
domains of the PKS during polyketide synthesis. The present invention provides the
genes encoding these enzymes in recombinant form. The invention also provides the
genes encoding the enzymes involved in ethylmalonyl CoA and 2-hydroxymalonyl CoA
biosynthesis in recombinant form. Moreover, the invention provides Streptomyces
hygroscopicus var. ascomyceticus recombinant host cells that lack one or more of these
genes and are useful in the production of useful compounds.

The cells are useful in production in a variety of ways. First, certain cells make a
useful FK-520-related compound merely as a result of inactivation of one or more of the
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable
to make FK-520 or FK-520 related compounds due to an inactivation of one or more of
the PKS genes. These cells are useful in the production of other polyketides produced by
PKS enzymes that are encoded on recombinant expression vectors and introduced into
the host cell.

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or
an FK-520 derivative compound is restored by introduction of a recombinant expression
vector that contains the functional gene in a modified or unmodified form. The
introduced gene produces a gene product that, together with the other endogenous and
functional gene products, produces the desired compound. This methodology enables
one to produce FK-520 derivative compounds without requiring that all of the genes for
the PKS enzyme be present on one or more expression vectors. Additional applications
and benefits of such cells and methodology will be readily apparent to those of skill in
the art after consideration of how the recombinant genes were isolated and employed in
the construction of the compounds of the invention.

The FK-520 biosynthetic genes were isolated by the following procedure.
Genomic DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus
(ATCC 14891) using the lysozyme/proteinase K protocol described in Genetic
Manipulation of Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The
average size of the DNA was estimated to be between 80 and 120 kb by electrophoresis on
0.3% agarose gels. A library was constructed in the SuperCos™ vector according to the
manufacturer's instructions and with the reagents provided in the commercially available
kit (Stratagene). Briefly, 100 µg of genomic DNA was partially digested with 4 units of
Sau3A I for 20 min. in a reaction volume of 1 mL, and the fragments were
dephosphorylated and ligated to SuperCos vector arms. The ligated DNA was packaged
and used to infect log-stage XL1-Blue MR cells. A library of about 10,000 independent
cosmid clones was obtained.
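
Whether a library of about 10,000 clones is likely to contain a given region can be estimated with the standard Clarke-Carbon relationship, P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert and N is the number of independent clones. The Python sketch below is only an estimate under stated assumptions: the 38 kb average insert size and the 8,000 kb genome size are assumed round figures for illustration and are not taken from the text; only the clone count comes from the procedure above.

```python
import math

N_CLONES = 10_000          # library size reported in the text
ASSUMED_INSERT_KB = 38.0   # assumption: typical cosmid insert size
ASSUMED_GENOME_KB = 8000.0 # assumption: approximate Streptomyces genome size


def probability_represented(n_clones, insert_kb, genome_kb):
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    fraction = insert_kb / genome_kb
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(probability, insert_kb, genome_kb):
    """Smallest number of clones giving at least the requested probability."""
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    print(probability_represented(N_CLONES, ASSUMED_INSERT_KB, ASSUMED_GENOME_KB))
    print(clones_needed(0.99, ASSUMED_INSERT_KB, ASSUMED_GENOME_KB))
```

Under these assumed sizes, 10,000 clones is far more than the number needed for 99% coverage, so failure to find a region would more plausibly reflect cloning bias or rearrangement than sampling depth.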

Based on recently published sequence from the FK-506 cluster (Motamedi and
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These
cosmids (pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that
overlap with one another. Initial sequence data from these two cosmids generated
sequences similar to sequences from the FK-506 and rapamycin clusters, indicating that
the inserts were from the FK-520 PKS gene cluster. Two EcoRI fragments were
subcloned from cosmids pKOS034-124 and pKOS034-120. These subclones were used
to prepare shotgun libraries by partial digestion with Sau3AI, gel purification of
fragments between 1.5 kb and 3 kb in size, and ligation into the pLitmus28 vector (New
England Biolabs). These libraries were sequenced using dye terminators on a Beckman
CEQ2000 capillary electrophoresis sequencer, according to the manufacturer's protocols.
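
The size selection used for the shotgun libraries can be pictured with an in silico partial digest. Sau3AI recognizes GATC and cuts immediately 5' of the G, so in a partial digest any pair of sites, not only adjacent ones, can bound a fragment. The Python sketch below lists the site-to-site fragments within a chosen size window; it is a simplified illustration that ignores fragments ending at the termini of the input molecule, and the function names are hypothetical. The toy sequence and the small size window in the test are chosen only to keep the example short.

```python
import re

SAU3AI_SITE = "GATC"  # Sau3AI cuts immediately 5' of the G in each GATC site


def cut_positions(sequence):
    """Return 0-based indices at which Sau3AI would cut the top strand."""
    return [m.start() for m in re.finditer(SAU3AI_SITE, sequence.upper())]


def partial_digest_fragments(sequence, min_len, max_len):
    """List (start, end) pairs of cut sites whose spacing is within the window.

    Any two sites may bound a fragment because digestion is partial.
    Fragments that end at a terminus of the input are deliberately ignored.
    """
    cuts = cut_positions(sequence)
    fragments = []
    for i, start in enumerate(cuts):
        for end in cuts[i + 1:]:
            length = end - start
            if length > max_len:
                break  # cut sites are sorted, so later ones are even farther
            if length >= min_len:
                fragments.append((start, end))
    return fragments
```

For a real subclone, the window would be set to 1500 and 3000 to mirror the 1.5 kb to 3 kb gel purification step described above.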

To obtain cosmids containing sequence on the left and right sides of the
sequenced region described above, a new cosmid library of ATCC 14891 DNA was
prepared essentially as described above. This new library was screened with a new fkbM
probe isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the
end of cosmid pKOS034-124 was also used. Several additional cosmids to the right of
the previously sequenced region were identified. Cosmids pKOS065-C31 and
pKOS065-C3 were identified and then mapped with restriction enzymes. Initial sequences
from these cosmids were consistent with the expected organization of the cluster in this
region. More extensive sequencing showed that both cosmids contained, in addition to the
desired sequences, other sequences not contiguous with the desired sequences on the host
cell chromosomal DNA. Probing of additional cosmid libraries identified two additional
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120,
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type
Culture Collection, Manassas, VA, USA. The complete nucleotide sequence of the
coding sequences of the genes that encode the proteins of the FK-520 PKS is shown
below but can also be determined from the cosmids of the invention deposited with the
ATCC using standard methodology.

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four
open reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame
encodes the loading module and the first four extender modules of the PKS. The fkbC
open reading frame encodes extender modules five and six of the PKS. The fkbA open
reading frame encodes extender modules seven, eight, nine, and ten of the PKS. The
fkbP open reading frame encodes the NRPS of the PKS. Each of these genes can be
isolated from the cosmids of the invention described above. The DNA sequences of these
genes are provided below, preceded by the following table identifying the start and stop
codons of the open reading frames of each gene and the modules and domains contained
therein.



Nucleotides                      Gene or Domain
complement (412 - 1836)          fkbW
complement (2020 - 3579)         fkbV
complement (3969 - 4496)         fkbR2
complement (4595 - 5488)         fkbR1
5601 - 6818                      fkbE
6808 - 8052                      fkbF
8156 - 8824                      fkbG
complement (9122 - 9883)         fkbH
complement (9894 - 10994)        fkbI
complement (10987 - 11247)       fkbJ
complement (11244 - 12092)       fkbK
complement (12113 - 13150)       fkbL
complement (13212 - 23988)       fkbC
complement (23992 - 46573)       fkbB
46754 - 47788                    fkbO
47785 - 52272                    fkbP
52275 - 71465                    fkbA
71462 - 72628                    fkbD
72625 - 73407                    fkbM
complement (73460 - 76202)       fkbN
complement (76336
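
The coordinate entries in the table above follow a simple pattern (an optional "complement" wrapper, a start, an end, and a gene name), so they can be turned into strand and length information mechanically. The Python sketch below parses a few of the complete entries; the parser, its field names, and the choice of sample rows are illustrative assumptions and not part of the specification. Incomplete entries are skipped rather than guessed.

```python
import re

# A few complete rows copied from the table, used as sample input.
SAMPLE_TABLE = """\
complement (412 - 1836) fkbW
5601 - 6818 fkbE
6808 - 8052 fkbF
47785 - 52272 fkbP
complement (76336
"""

ENTRY = re.compile(r"^(complement\s*\()?\s*(\d+)\s*-\s*(\d+)\s*\)?\s+(\S+)$")


def parse_gene_table(text):
    """Parse coordinate rows into dicts; rows that do not match are skipped."""
    genes = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # e.g. a truncated row with no end coordinate or name
        start, end = int(match.group(2)), int(match.group(3))
        genes.append({
            "name": match.group(4),
            "start": start,
            "end": end,
            "strand": "-" if match.group(1) else "+",
            "length": end - start + 1,  # coordinates are inclusive
        })
    return genes


if __name__ == "__main__":
    for gene in parse_gene_table(SAMPLE_TABLE):
        print(gene["name"], gene["strand"], gene["length"], gene["length"] % 3)
```

A length that is a multiple of three is a quick sanity check that a transcribed coordinate pair can describe a complete open reading frame.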
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\ GATCTCAGGC ATGAAGTCCT CCAGGCGAGG CGCCGAGGTG GTGAACACCT CGCCGCTGCT 
61 TGTACGGACC ACTTCAGTCA GCGGCGATTG CGGAACCAAG TCATCCGGAA TAAAGGGCGG
121 TTACAAGATC CTCACATTGC GCGACCGCCA GCATACGCTG AGTTGCCTCA GAGGCAAACC
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG GAGTACGCGA CGAGAGTGGC GCACCCGCGC
241 ACCGTCACCT CTCTCCCCCG CCGGCGGGAT GCCCGGCGTG ACACGGTTGG GCTCTCCTCG
301 ACGCTGAACA CCCGCGCGGT GTGGCGTCGG GGACACCGCC TGGCATCGGC CGGGTGACGG
361 TACGGGGAGG GCGTACGGCG GCCGTGGCTC GTGCTCACGG CCGCCGGGCG GTCATCCGTC
421 GAGACGGCAC TCGGCGAGCA GGGACGCCTG GTCGGCACCT GCGGGCCGGA CGACCG7GTG
481 GTTCGCGGGC GGGCGGTGGC CGGTGGTGAG CCAGCTCTCC AGGGCGGTGA AGGCTGAGCG
541 GTGACACGGC AGCAAAGGCC GGAGTCGGTC GGGGAAGGTG TCGACGAGGG CGTCGGTGTG
601 CGTGCCGTCC TCGATGCGGT AGTAGCGGTA CCGGCCGCCA GGCCGCTGCC GGACATACGC
661 GCGTACACGT CGGAGCCCGG GCGGCAGGCA GCAGCACGTC GAGAGTGCCT GGATGGTGAT
721 CAGCGGCTTG CCGATACGAC CGGTCAACGC GATGCGTTCC ACGGCCGCGT GGACGC^GGA
781 GGAGCGGGTG GCGTAGTCGT AGTCGGCATC GCAGCCCGGG ACCGTCCCCG GGGCGCAATA
841 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA GCCGGGGTCG AACTCCTCGC GGTAGACGCG
901 CTGCGTCAGA TCCCAGTAGA CCTCGTGGTG GTACGGCCAC AAGAACTCGG AGTCGGCCGG
961 GAACCCGGCG CGGAGCAGCG CCTCGCGCGC CTGGCCGGCT GCGGGGCCGC CTGCCGCGTA
1021 GGTGGGGTAG TCGCGCAGGG CGGCCGGCAG GAAGGTGAAG AGGTTGGGAC CCTCCGCGCG
1081 CCACAGGGTG CCTTCCCAGT CGACTCCTCC GTCGTACAGC TCGGGATGGT TCTCCAGCTG
1141 CCAGCGCACG AGGTAGCCGC CGTTGGACAT CCCGGTGACC AGGGTGCGCT CGAGCGGCCG
1201 GTGGTAGCGC TGGGCGACCG ACGCGCGGGC GGCCCGGGTC AGCTGGGTGA GGCGGGTGTT
1261 CCACTCGGCG ACGGCGTCGC CCGGCCGGGA GCCATCACGG TAGAACGCGG GGCCGGTGTT
1321 GCCCTTGTCG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG
1381 GTCGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCCGGCC ACGACCAGGC CACCGTTCCA
1441 GCGGTCGGGC AGCCGGATGA CGAACTGGGC GTCGTGGTTC CACCCGTGGT TGGTGTTGGT
1501 GGTGGAGGTG TCGGGGAAGT AGCCGTCGAT CTGGATCCCG GGCACTCCGG TGGGAGTGGC
1561 CAGGTTCTTG GGCGTCAGCC CTGCCCAGTC CGCCGGGTCG GTGTGGCCGG TGGCCGCCGT
1621 TCCCGCCGtG GTCAGCTCGT CCAGGCAGTC GGCCTGCTGA CGTGCCGCCG CCGGGACACG
1681 CAGCTGGGAC AGACGGGCGC ^GTGACCGTC CGGGGCATCG GGAGCAGGCC GGGCCGTGGC
1741 CGGTGAGGGG AGCAGGACGG CGACTGCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCGTCT
1801 TCTCGGGGCC CGTCCGACAC CGAGGGGCAG AACCATGGAG AGCCTCCAGA CGTGCGGATG
1861 GATGACGGAC TGGAGGCTAG GTCGCGCACG GTGGAGACGA ACATGGGTGC GCCCGCCATG
1921 ACTGAGGCCC CTCAGAGGTG GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGCTCCGG
1981 GGCGGTGCCC GCGGCCGCCA CCGGTTCCGG GTCCCCGGGT CAGGGACAGG TGTCGTTCGC
2041 GACGGTGAAG TAGCCGGTCG GCGACTCTTT CAAGGTGGTC GTGACGAAGG TGTTGTACAG
2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GGTGTAACCG GCGCTCGTCG TGGCGCG.GCC
2161 CGCCTGGACG TGAGCGTAGT TGCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG
2221 CGCGGTGACC GCGCCCGAGA GCGGTCCGGC CTTGCCGTCC GCGTCCCGGG CGGCGACCGC
2281 GTAGGTGTGC GATGTGCCCG CCCTCAGGCC GGTGTCCGTG TACGACGTCG TGGCGGACGT
2341 GGTGATCTGG GCACCGTCGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTTTCCA
2401 GGTCAGGCTG ATGGTGGTGT CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG
2461 CGAACCGGGG TCGGAGGCGG ATCCGCTCAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA
2521 ACAGATCGAG TCCAGGAAGT AGGCGGCGCC GGTGCTGCCG CACTGCTGTG CTCCGGTGCC
2581 GGGATCGACC GGGGTGCCGT GCCeGATGCC CGGCACCCGG TTCACCTCCA CGGCCACCGA
2641 TCCGTCCGCG GCCAGGTACT CCTCGTGCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG
2701 GTCCGGCGTC TGGGACACGC CGTGCACAGC GGTCCACTGG TCGCGCAACT CGTCGGCGTT
2761 GCGCGGCGCG ACGGTGGTGT CCTTGTCGCC GTGCCAGATG GCCACGCGCG GCCACGGGCC
2821 CGACCACGAG GGGTAGCCGT CACGGACCCG CCGCGCCCAC TGGTCCGCGG TCAGGTCGGT
2881 CCCGGGGTTC ATGCACAGGT ACGCGCTGCT GACGTCGGTG GCACAGCCGA AGGGCAGGCC
2 94 1 GGCGACGACC GCGCCGGCCT GGAAGACGTC CGGATAGGTG GCGAGCATCA CCGACGTCAT 
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3001 GGCACCGCCG GCGGACAGCC 
30 61 GACGGTGTGA GCGGCCATCT 
3121 GCTGCTCTGG AACCAGTTGA 
3181 CACGAGCAGG AAGCCATAGC 
5 324 1 CTGGGCGTCC TGGGTGCAAC 

3 301 CGCGGGCCGG TAGACGTACA 
3361 GGTCAGGTCC GCCTTGGTCA 
34 21 CGCCGGGCCG AGCAGGGCCG 
34 81 CACCCCCCGC CGTCCCGGAC 

10 35 41 CAGCGGGGTG AGGATTCCCC 

3601 GGGGGG AC AC GGAGGGCTCC 
3661 TAGGGGTGGT TCAACCCGCA 
3721 TGCGCCCGGA CGGATTGTGT 
37 81 ACCCGACACG GGTAGGGCGT 

15 384 1 ACGGACCGGG CGTCGGCGGA 

3901 CCAGCCGCGT GGGGCGGCCG 
3961 CGGACCGGTC AGTGCAGTCC 

4 021 GCGGCGAACC GGGGTCCGTG 
4 081 ACGATGACAC CGTCCTGGTT 

20 4 141 CGGCTGGCGG ACTCCCGGGT 

4 201 AAGACCGGGT TCGGCAGCCT 
4 2 61 ATGTCGGTGA CGCTCTGCCC 
4 321 TTGCCCCAGG TGGTGCCCGC 
4 381 GTCAGGAGCG TGAGCCAGGA 

25 444 1 TACACGTCGC CGGTGGTGAA 

4 501 GTGCGGGTGG CGTCCTGGTC 
4 561 CGGTCCGCTG TGAAATGCCG 
4 621 ACCGTACGTA GTCGTAGAAC 
-4681 CCACGCCGAC CGTGCGCCGC 

30 4 741 CGGGCCCGGA CGGGCTGCCG 

4 801 GGGCCCGCAG CGTGCTCAGC 
48 61 CGGCGCACAG CCGGTCGGTG 
4 921 CCTCATCGGC CAGCTCCGCG 

4 981 GGACGAGCAG GCACAGTGCC 
35 5041 CTCGTGGGCT GGTCAGCCCC 

5101 CGGCGGCGTC GCCGCGCAGT 
5161 GGAGGTCGGG CACCAGCCAG 
5221 TGTCGGGGTC GATCAGGGCG 
5281 GCAGGGCGTG GGCGCGGAAG 
40 5341 GGTCGAACAG CGGCACGCCC 
54 01 GCTGGGAGAT GTTGAGCCGT 
54 61 TGAACCACTG CAACTCCCGT 

5 521 CGAGGTTTCG TCATTTCACA 
5581 GACCCCATGG GAGGGACCCC 

45 5641 CCGGGCCCCT GTCCGGTCTG 

5701 CCACCCGCCA CCTGGCGGAC 
57 61 GCGACCTCGC CCGCGGCTAC 
5821 TGAACCGGGG GAAGGAGAGC 
5881 TGCACGCCTT GGTGGACCGG 

50 594 1 GCCGCCTGGC ATCGGCCACC 

6001 CATATCCGGC TACGGCAGTA 
6061 TCCAGTGCGA AGCGGGGCTG 
6121 GCCTGTCCAT CGCGGACATC 
6181 TGCTGAAGCG GGCCCGCACC 

55 6241 TCGGTGAATG GATGGGATAC 

6301 GCGCCGGCGC CAGCCACGCG 

63 61 AGACGATCAA TCTCGGGCTC 

64 21 TACAACGCCC CGGTCTCTGC 
64 81 ACCGCACCGA GCTCGACGCC 

60 6541 TGGTGGCGCG GCTGGAGGAG 

6601 TCAGCGAACA CCCCCAACTG 
6661 GTGCGCTGGA GGGCCTGATC 
67 21 GCCGGGTCCC GGAGCTGGGC 
67 81 ACAGCGCCGA CCGCGAAGAG 
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CGGTGATGTA GGTGCGCTGG GGGTCCGCGC CGTAGGCGGA 
GCCGGATCGA CGCGGCTTCG CCCTGGCCCC TGCGGTTGTC 
AGCACCTGTT CGCGTTGTTC GACGACGTGG TCTCGGCGAA 
GGTCCGCGAA TGAGAGCAGG CCGGAGTTGT CGGCGTAGCC 
CGTGCAGGGC GAACACCACC GCCGGCTCCG CGGGCAGGGA 
TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACCTC 
GACCGGGCTT GGCCAGGCCC GCCGCGGCGT GGGCCGTCGG 
CTCCGAGTAC GAGGGCCACG ACGGCCACGA GACGGG~GAG 
GCGACAACGA CCCGACCGGC GGCGAGGAGG AGAGGGGGAA 
GGAACGGCGG CGGCTGCATG GCGGCTCCCT CGATGTCGTG 
CTGACGTCGA TCAGTGGGAG CGCCCCGGTG CCCGGCACCG 
ACGGTATGGC CCGGAGCACC ACACCCCGCA CCGCGCGATG 
CGCCTTGCGG AATCTGATAC CCGGACGCGA CGAACGCCCC 
CATGGTGTCC GACTCGGCCG GTCGGCCTTG CCTGCCCTGG 
CCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG 
CGCCCAAGTG CAGTACGCCG PiCCGTGGCCG GCGGGAGGGC 
CGCGGCCCTG CGGGACCGCT CGTCCCAGAC GGGTTCCACC 
TCCGCGGCGG TAGACCATCA GTGTCCGCTC GAAGGTGATG 
GTAGCCGATG GTGCGCACGC TGATGATGCC TACGTCAGGT 
GTTCAGGACC TCGGACTGCG AGTAGATGGT GTCGCCCTCG 
GACCCGGTCC CAGCCGAGGT TGGCCATCAC ATGCTGGGAG 
GGTGACCAGG GCGAGGGTGA AGGTGGAGTC CACCAGCGGC 
CGAGTAGTGG CGGTCGAAGT GCAGCGGCGC GGTGTTCTGC 
GTTGTCGGTC TCCAGGACCG TGCGGCCCAG GGGGTGGCGG 
GTCCTCGAAG TAGCGGCCCT GCCAGCCCTC GACCACAGCG 
CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTCCC 
AACCTTCACC GGGCTCATAC GTGCGGCGCA TGAGCC^TGG 
CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTGTGA 
GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGGTCAC 
GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA 
TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG 
ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCC AC GAACG 
GTCCGCACCC GGCGGCGTCT GGCCAGCCGG TGTCCGGGTG 
TCGTCCCGCA GTGGTGTCCA CTCCACATCG TCCCCGGCGG 
AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT 
TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG TACCCGGCGA 
GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG 
GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC 
ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC 
ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG 
TCCGCGGTGA TCGTCACGTG CTCGTGCTCG GCCAAGGCCG 
ATCTCCATGC AGGGACTATA CGTACCGGGC ATGGTCCTGG 
GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG 
ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG 
CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG 
CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGC\GCG 
GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC 
GTCCAGCTCG ATGTGCGCTC GCCGGAGGGC AACCGGCACC 
GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG 
AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA 
CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCCTGG 
GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGGTGG 
TGTGCGGGGA TGTACGCGTA CTCCGGCATC CTCACGGCCC ' 
GGCCGGGGCT CGCAGTTGGA GGTCTCGATG CTCGAAGCCC 
GCCGAGTACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC 
ACGATCGCCC CCTACGGCCC GTTCACCACG CGCGACGGGC 
CAGAACGAGC GGGAGTGGGC TTCCTTCTGC GGTGTCGTGC 
GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC 
CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC 
GCGTCGATCG CCTACGCACG CCAGCGCACC GTGCGGGAGT 
CGTGACCGTG GACGCTGGGC TCCGTTCGAC AGCCCGGTCG 
CCCCCGGTCA CCTTCCACGG CGAGCACCCG CGGCGGCTGG 
GAGCATACCG AGTCCGTCCT GGCGTGGCTG GCCGCGCCCC 
GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCCTG
6841 GCCGCCGTGT TCCTGCTCGC CGGCGTACGG GGGCTGAACA TGGGCCTGCT CGCGCTGGTC
6901 GCCACCTTTC TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGGT 
6961 TTCCCCGCGA GCATGTTCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC 
7021 GTCAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG 
7081 GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TCTGCGCGAC AGGCGCGGCC

7141 TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCGTCG CGTTCGCCGT CAGGCACCGC
7201 ATCGATCCGC TGTACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CAGTTTCGCC
7261 CCCTCCGGGA TCCTGGGCGG CATCGTCCAC TCGGCGCTGG AGAAGAACCA TCTGCCCGTC
7321 AGCGGCGGGC TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTJTCA
7381 TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC TGGACGAGGA CACCGATCCC

7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACGCT GACCGCGATG
7501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCACC

7561 TTGGCGGCGT TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGATC
7621 GCCTGGCCCG TGGTGCTGCT GGTATGCGGG ATCGTGACCT ACGTCGCCCT GCTCCAGGAG

7681 CTGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCCC GCTGCTGGCC

7741 GCCCTCGTGA TCTGCTACGT GGGCGGTGTC GTCTCGGCCT TCGCCTCGAC CACCGGGATC
7801 CTCGGTGCCC TGATGCCGCT GTCCGAGCCG TTCCTGAAGT CCGGTGCCAT CGGGACGACC 

7861 GGCATGGTGA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAGTCC CTTCTCCACC

7921 AATGGTGCTC TGGTGGTGGC CAACGCTCCC GAGCGGCTGC GGCCCGGCGT GTACCAGGGG
7981 TTGCTGTGGT GGGGCGCCGG GGTGTGCGCA CTGGCTCCCG CGGCCGCCTG GGCGGCCTTC

8041 GTGGTGGCGT GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGTCG 
8101 CTGACGTAGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCATGGC 
8161 TAATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTACT 
8281 GCCGGTGCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG

8341 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT
8401 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCGA
8461 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGL'CCG
8521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT

8581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT
8641 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA
8701 AGCGGTGCAG GACCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA
8761 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG
8821 GTGACCGGGG CGATGTCGGC GGCGGTCAGC GTCAGCGTCG TCGGCGCGGG CCTCGCGGAG 

8881 GGCTCCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG

8941 GGGCAGTCGG AGTCCGCGAA GCCCGCGAAC CGGTAGGCGA TCTCCATCAT GCGGTTGCGG
9001 TCCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCCTG GTCCGTGAGC 
9061 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAGT 
9121 TTCAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCGTGCGGG 

9181 CCGAAGCGGT CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC

9241 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA
9361 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA
9421 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA

9481 ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA

9541 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA 
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGCTCC CACGCGAGGT 

9721 CGTGGTCGTT CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGGTGATCA
9781 CCTCGCGGAT CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA

9841 AGGTGTTGTC CAGGTCCCAG ACCAGACACT TGACAATGGT CATGGCTGTC C7CTCAAGCC
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCCTCGATG 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCCTC TCTCGCGCCT 

10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCGTTC GGCGGCGACG 
10081 TGCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGGTCGCTG
10141 GCGTACTCGC ACACGCGGGC CGCGATCTGC TCCGCGGTCC ACAGGTCGGC GATGTGCCCG
10201 GCGACGAGTT GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCGTGGGCC 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGATC CCGACGCAGC CCCAGGCGAC CGACTTGCGC 
10321 CCGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 
10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG
10441 CCGGACGGCT TCGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC 
10501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCGTCC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAGT
10681 TTCCCGCTGG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCGTCGCC GAGCCGCTGC
10741 ACGGTCCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG
10801 CCGACGTGTG CGGTGAACTC GCCGTTCTCC CGGCTGCCGA GTCCCAGACC GCCGTGCTCG 
10861 GCCGCCACTT CCGCGCAGAG CAGGCCGTCG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC 
10921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGGTCGGT CAGCAGCGCG
10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCATGG ACTCGACGGT
11041 ACGGAAGTTC GCGAGCTGGA GGTCCGGGCC GGCGATCGTG ACGTCGAACG TCTTCTCCAG
11101 GTACACGACC AGTTCCATCG CGAACAGCGA CGTGAGGCCG CCCTCCGCGA ACAGGTCGCG
11161 GTCCACGGGC CAGTCCGACC TGGTCTTCGT CTTGAGGAAC GCGACCAACG CGTGCGCGAC

11221 GGGGTCGTCC TTGACGGGTG CGGTCATGAG AACACCTTCT CGTATTCGTA GAAGCC~CGG
11281 CCGGTCTTCC GGCCGTGGTG TCCCTCGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG 
11341 CTGCGCTCGT CGCCGGTGCG TTTGTGCAGC ACCCACAGCG CGTCGACGAG GTTGTCGATG 
11401 CCGATCAGGT CCGCGGTGCG CAGCGGCCCG GTCGGATGGC CGAGGCACCC CGTCATGAGC 
11461 GCGTCGACGT CCTCGACGGA CGCGGTGCCC TCCTGCACGA TCCGCGCCGC GTCGTTGATC

11521 ATCGGGTGGA GCAGCCGGCT CGTGACGAAG CCGGGCGCGT CCCGGACGAC GATCGGCTTG
11581 CGCCGCAGCG CCGCGAGCAG GTCCCCGGCG GCGGCCATGG CCTTCTCACC GGTCCGGGGT 
11641 CCGCGGATCA CCTCGACCGT CGGGATCAGG TACGACGGGT TCATGAAGTG CGTGCCGAGC 
11701 AGGTCCTCGG GCCGGGCCAC GGAGTCGGCC AGTTCGTCAA CCGGGATCGA CGACGTGTTC 
11761 GTGATGACCG GGATACCGGG CGCCGCTGCC GAGACCGTGG CGAGTACCTC CGCCTTGACC 

11821 TCGGCGTCCT CGACGACGGC CTCGATCACC GCGGTGGCCG TACCGATCGC GGGCAGCGCG
11881 GACGTGGCCG TCCGCAGCAC ACCGGGGTCG GCCTCGGCGG GCCCGGCCAC GAGTTGTGCC 
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG 
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT 
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG 

12121 GCAGCGAGTA CGGG7CGAGG ACGTCTTCCG GGGTCGACCC GATCGCGTCC TTGCGGCCGA
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC 
12241 TGCCCCTCGA GTCGAGGACG CTCAGGCTGT CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG 
12301 CGCACAGGGC CGCCAGCGAC GGGCCGAGCT CGCGGTCCGG CAGTTGCTGG TACTCGCCCT
12361 CGGCGCGGGC CTGCCCCGGA TGGTCGACGC AGATGAACGC GTCGTCGAGC AGGGTCTCG

12421 GCAGTTCGGT CTTGCCCGGC TCGTCGGCGC CGATGGCGTT CACATGCAGG TGCGGCAGCC
12481 GCGGCTCGGC GGGCAGCACC GGCCCTTTGC CCGAGGGCAC CGAGGTGACG GTGGACAGGA
12541 CATCCGCGGC GGCGGCGGCC TCCGCCGGAT CGGTCACCTT GACCGGCAGT CCGAGGAACG
12601 CGATGCGGTC CGCGAACGAC GCCGCGTGGC CGGGGTCGGT GTCGCTGACC AGGATCCGCT
12661 CGATGGGCAG GACCCTGCTG AGCGCGTGCG CCTGGGTCAC CGCCTGTGCG CCCGCGCCGA

12721 TCAGCGTGAG CGTGGCGCTG TCGGACCGGG CCAGCAGCCG GCTCGCGACG GCGGCGACCG
12781 CGCCGGTCCG CATCGCGGTG ATCACGCCTG CGTCGGCGAG GGCGGTCAGA CTGCCGCTGT
12841 CGTCGTCGAG GCGCGACATC GTGCCGACGA TCGTCGGCAG CCGGAAGCGC GGATAGTTGT
12901 GCGGACTGTA CGAAACCGTC TTCATGGTCA CGCCGACACC GGGGACCCGG TACGGCATGA
12961 ACTCGATGAC GCCGGGAATG TCGCCGCCGC GGACGAATCC GGTACGCGGC GGCGCCTCGG

13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CGTCGTGCAG CTCGCTGATC AGCCGGTCCA
13081 TCATCACGTC GCGGCCGATC ACGGAGAGAA TCCGCTTGAT GTCACGTTGG CGCAGGACCC
13141 TGGTCTGCAT GTGTCACCTC CCTTTCGTGG CCGGAGCTGT CTTGGTGGTG CCGCTCGGGG
13201 CGGCTTCCGT TCTCATCGCA GCTCCCTGTC GATGAGGTCG AAAATCTCGT CCGCGGTCGC 
13261 GTCCGCGGAC AGCACGCCGG CCGGCGTGGT CGGGCGGGTC TCCCGCCGCC AGCGGTTGAG 

13321 CAGGGCGTCC AGCCGGGTTC CGATCGCGTC CGCCTGGCGG GCGCCCGGGT CGACACCGGC
13381 AACGAGTGCT TCCAGCCGGT CGAGCTGCGC GAGCACCACG GTCACCGGGT CGTCCGGGGA 
13441 CAGCAGTTCA CCGATGCCGT CGGCGAGTGC GCGCGGCGAC GGGTAGTCGA AGACGAGCGT 
13501 GGCGGACAGT CGCAGACCGG TCGCCTCGTT GAGGCCGTTG CGCAGCTGCA CCGCGA^GAG 
13561 CGAGTCCACA CCGAGTTCCC GGAACGCCGC GTCCTCCGGG ATGTCCTCCG GGTCGGCGTG 

13621 GCCCA 3GACG GCCGCTGCCT TCTGCCGGAC GAGGGCGAGC AGGTCGGTGG GGCGTTCCTG
13681 CTCGTTGCGG GCGCTCCGGC GGGCCGACGG CTTGGGCCGG CCACGCAGCA GCGGGAGGTC 
13741 CGGCGGCAGG TCGCCCGCCA CGGCGACGAC ACTGCCCGTT CCGGTGTGGA CGGCGGCGTC 
13801 GTACATGCGC ATGCCCTGTT CGGCGGTGAG CGCGCTCGCC CCACCCTTGC GCATACGGCG 
13861 CCGGTCGGCG TCGGTCAGGT CCGCGGTCAG GCCACTCGCC TGGTCCCACA GCCCCCACGC 

13921 GATCGACAGC CCTGGCAGCC CTTGTGCACG CCGGTGTTCG GCGAGCGCGT CGAGGAACGC
13981 GTTCGCCGCC GCGTAGTTGC CCTGACCGGG GGTGCCCAGC ACACCGGCCG CCGACGAGTA
14041 GACGACGAAT GCGGCGAGGT CGGTGTCGCG GGTGAGCCGG TGCAGGTGCC AGGCGGCGTC
14101 GGCCTTGGGT TTGAGGACGG TGTCGATGCG GTCGGGGGTG AGGTTGTCGA GCAGGGCGTC
14161 GTCGAGGGTT CCGGCGGTGT GGAAGACGGC GGTGAGGGGT TGAGGGATGT GGGCGAGGGT

14221 GGTGGCGAGT TGGTGGGGGT CGCCGACGTC GCAGGGGAGG TGGGTGCCGG GGGTGGTGTC
14281 GGGGGGTGGG GTGCGGGAGA GGAGGTAGGT GTGGGGGTGG TTCAGGTGGC GGGCGAGGAT
14341 GCCGGCGAGG GTGCCGGAGC CGCCGGTGAT GACGACGGCC CCCTCGGGGT CCAGCGGCCG
14401 CGGGACCGTG AGGACGATCT TGCCGGTGTG CTCGCCGCGG CTCATGGTCG CCAGCGCCTC
14461 GCGGACCTGC CGCATGTCGT GCACCGTCAC CGGCAGCGGG TGCAGCACAC CGCGCGCGAA
14521 CAGGCCGAGC AGCTCCGCGA TGATCTCCTT GAGCCGGTCG

14581 CAACGGTCGC TGGACGGCGT GCCGGATGTC CGTCTTCCCC

14641 CGGCGCGAGC AGGCCGACGG ACGCGTCGAG GAGTTCACCG

14701 GTCGACCGGC GGGAACGCGT CGGCGAACGC GGTGCTGCGG

14761 GTCCAGGTCC ACCAGATGGC GCTTCGCGGC GCTGGTGGTC

14821 GTGCCGCGCG ATCTGCCGGG CGGCGGAACC GACACCGCCG

14881 CTTCTCGCCG GGGCGCAGCC CGGCGAGGTC GACCAGGCCG

14941 GGTCATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC
15001 GTGGTCGGCG ATGACCGTGG GGCCGAAGCC GGTGCCGACG

15061 CGGTGCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG

15121 GAGCACGCCC TGACCGGGGT AGGTGCCGAG CGCGATCAGC 

15181 CGCCGCACGC ACACCGATCC GGACCTCGGC CGGGGCGAGG 

15241 GTCGGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC 

15301 GCTGTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGACGGCG

15421 GAGCGTGACG CCGGACTCGG TCTCGACGTG GACGAACCGG 

15481 GGCGCGCAGC AGTCCGGCCG CCGCGCCGG7 GGCGAGGCCC 

15541 ATCCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG

15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG

15661 ATCCCTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC

15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG 

15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG 

15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA 

15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC

15961 GTCCGCCTCG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG

16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC 

16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG 

16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG

16201 GTGCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG 

16261 GGCCTCATCA GCCCCTTCCA CGGTCACCGA CACATCCACC

16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA 

16381 GATGACCAGC TCCACAAACG CCGTACCCGG CAGCAGGACC 

16441 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG

16501 GGCGGGCAGC GCTGTGACAG CGGCCAGCAT CGGATGCGCC 

16561 CGACAGATCG GTGGCACCGG CCGCCTCCAG CCAGTACCGC

16621 GGGCAGATCC AGCAGCCGTC CCGGCACCGG TTCGACCACC 

16681 GCCCAGGGTC CACGCCTGCG CCAACGCCGT CAGCCACCGC 

16741 CCGCAACGAC GCCACCGTGT GAGCCTGCTC CATCGCCGGC 

16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC 

16861 ACGCAGATTC CGGTACCAGT ACCCCTCATC CACCGGCTCC

16921 GGTCGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC 

16981 TTCATCCTCG ATGGCTTCCA CGTGGGGCGT GTGGGAGGCG 

17041 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC 

17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC 

17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC

17221 GATGACCTGA CTGCGCAATG CCACCACGCG GGCGGCGTCC 

17281 CACGCACGCC GCCGCGATCT CGCCCTGGGA GTGTCCGATC 

17341 ATGCGCCTGC CACAGCGCGG CCAGGCTCAC CGCGACCGCC

17401 CTCCACCCGC TCCGCCACAT CCGGCCGCGC CAACATCTCC

17461 CGGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG

17521 GAGTTCCACG CCCATGCCGA CCCACTGGGC GCCCTGGCCG

17581 CGGCTGGTCC ACCGCCACAC CCGTCACCCG GGCATCGCCC
17641 GAAGACAGCA CGCTCCCGCA CCAACCCCTG CGCGACCGCG
17701 GCGCAGATAC CCCTCCAGCC GCTCCACCTG CCCCCGCAGA

17761 CACCGGCAAC GGCACCAACC CGTCAACAAC CGACTCCCCA

17821 CTCAAGGATC ACGTGCGCGT TCGTACCGCT CACCCCGAAC 

17881 TGCCCGATCC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC 

17941 CCAGTCCACA TGCGACGACG GCTCGTCCAC ATGCAGCGTC 

18001 CATCGCCATG ACCATCTTGA TCACACCGGC GACACCCGCC 

18061 GTTCGACTTC AACGAACCCA GCAGCAGCGG AACCTCACGC

18121 AATGGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC 

18181 GTCCACATCG GCGGCGCGCA GTCCGGCGTT CACCAACGCC 

18241 GGACGGGCCG TTGGGGGCGG ACAGCCCGTT GGAGGCfcCCG

18301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG
GGCCCCGCGT CCATCAGGTC
ATCTCGATGA ACCGGCCACC 
GTGAGCGAGT TGAGCACGAC 
GAATCGGCCA GATGCGCTCC 
GCGTACACCT CCGCGCCChG 
GTGGCCGCGT GGATCAGGAC 
TACCACGCGG TCGCGAACGC 
GGCATCCGGC CGAGCATCCG 
AGGCCGAAGA CGCGGTCGCC 
ATGCCCGCGG CCTCGCCGCC 
ACATCGCGGA AGTTGAGGCC 
GGGCGCCGGG GCTCCGCCGA 
GGCCGGATCA GCCACGTGTC 
CGGGCCGCCT CGAACCGGCC 
ATGCGCTGCT GCTCGGGGGC 
CCGGGCTGCT CGGCCTGGGC
GCGGTGGTGT GCACGAGCAG 
GTGAGCGCAC GCGTCTCGGC 
TCCACGTCGG TCGCGGGGAC 
AGGCCGGTGC CGACGGGTGG 
AGTTGGCCGG CGGAGTCGGC 
GCTCGGAGCA TGGCCGAGCC 
CCCGCAGCGC TGTCGTCCGG 
GCCGGATGCA CACCGAAACC 
GCATACACGG TGTCACCATC 
TCATAACCGG CATCCCGCAG 
ACCGGCGGCC ACTGCGAGAA 
GGGGTCAGGG TGCCGCTGGC 
ACGGTCACCG GCCGCCGTCC 
GCTGCGGTCA CCGGCACCAC 
CCGGTCTCGT CACCGGCCCG 
GTGCCCCGCA CCGCGTGATC 
AGAACAACAC CACCATCGTC 
GCACCCGTCA ACCCCGCCGC 
CTGTGCTCGA ACGCGTACGT 
GTGTCCCAGT CCACTGCCGT 
TCCCAGCCGC CGTCACCGGT 
AGCAGCACCG GATGGGCACT 
GCGTCCAACG CCACCGGACG 
GTCACCCAGG CGCTGTCCAC 
CCCTCCAGTA CCTTGGCCAG 
TAGTCGACCG CGATACGACG 
TCCACCGCCG ACGGGTCCCC 
CACACACCCT CGACCAGACC 
CGCCCGGCCA GTCGCGCCGC 
TCGAGGCTGA GGGCTCCGGC 
ACCGCGTCCG GCACGACCCC
CAGCTGGCCG GCTGGACCAC 
CGCACATCCC AGCCCGTGTG 
AACACCGCGG AGTGGGCCAT 
GGGAAGACGA ACACCGTACG 
AGCAGCACCG CACGGTGACC 
GCCACATCCA CACCACCCCC 
CTCACCTCAC CACGAGCCGA 
CGCGACGGCC CAGGAACACC 
GACGACACAC CCGCATGCGG 
AGCTCCACCG CACCGGCCGA 
TTCGGCGCGA TCCCGTACCG 
GCCGCCTGCG CATGACCGAT 
TCCTGCCCGT ACGTCGCCAG 
GTCCCGTGCG CCTCCACCAC 
TGCTGGATGA CACGCTGCTG 
TCCTGGTTCA CCGCCGACCC 
TCGGAGAGCC GCTCCAGCAC
18361 AAGAACGCCG GCGCCCTCCG CCCAGCCGGT GCCGTTGGCG
18421 GCGGCCGTCG GGGGAGAGTC CGCCCTGCTG CTGGAATTCC 
18481 CATGACGGTG ACACCGCCGA CCAGCGCCAG CGAGCACTCC 
18541 GGCCTGGTGC AGCGCGACCA GCGACGACGA GCACGCCGTG
18601 CTGGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACG
18661 GCCGAACCCG TCCAGGTCCG CGCCGACGCC GTACCCGTAC
18721 GCCGGTGTCG CTGCCGCGCA GTGTGCCCGG CACGATGCCC 
18781 TGTCGTTTCC AGCAGGATCC GCTGCTGGGG GTCCATGGCC 
18841 GCCGAAGAAC GCGGCATCGA AGCCGGCGGC GTCGGAGAGG

18901 CGATCCGCCG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG
18961 GTCGCCGCCA CTGTCCACCA TGCGCCACAG GTCGTCGGGC
19021 TCGGCAGGCC ATGCCCACGA TGGCCAGCGG TTCGTCACGG
19081 AGCGACCGGT GCGGCACCAC CGACCAGAGC CTCGTCCAAC
19141 CGTCGGGTAG TCGAAGACAA GCGTGGCGGG CAGTCGGACA

15 19201 GTTCCGCAGT TCGACGGCGG TCAGCGAGTC GATACCCAGT 
19261 GGACACGTCC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC 
19321 CAGCAGCGCG GTGTCCCGCT CAGCGCCGGA CATGGTGCCG 
19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA 
19441 GTGCGCGGTG AGGTCCATCG TGGCCGCCAC GGCGAACGCG 

20 19501 TTCCAGCAGG CGCATGCCCA CACCGGCCGA CATGGGGCGG 
19561 GGTGCGGTTG GTGCCGCTCA TGCTGCCGGT GAGTCCGCTG 
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC 
19681 CCCGTTCGCC GCCGAGTAGT TGCCCTGGCC GCGGCCGCCC 
19741 GTAGAGGACG AACGAGCGCA GGTCCGCGTC CCGGGTCAGC 

25 19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG 
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC 
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC GTCACAGCGG 
19981 CGCCGGCGGT TCGCTGCGCG ACAGCAACAG GAGGTGGCGG 
20041 ATGCCGGGCG AGGAGACCTG CCAGCACACC CGAGCCGCCG

30 20101 CGGGTCGAGC AGCGGTTCGG GCGTTTCCGC GGCGGCCGTG 
20161 GTACCGGCCG TCGGTGACGC GGACGTACGG CTCGGCCAGT 
20221 CTCGATGGGG GTGTCGGTGC CGGTCTCCAC CAGCACGAAC 
20281 GGCGGACCGG ACGAGGCCGG CGACCGCTCC TCCGACCGGT 
20341 GAGGGTGGTC TCCGCAGGGC CGTCCTCGGC GATCACCCGG

20401 CTCGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC
20461 GATGTGGACC GCGTCCGCAG GACCGGGCCC GGGAGTGGGC
20521 GTACAAGGAG TTCCGTACGA CGGCGGCGTC GCCGTCGACG
20581 CGCGGCGACG GTCACCACCG GTTGGCCGAC CGGGTCCGTC
20641 CGGGCCCTGA GTGATCGTGA CGCGCAGCGT GGTGGCCCCG

40 20701 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC 
20761 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG GTGGACGCCA 
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG 
20881 CAGTCCCTGG AACGCTGGGC CGTAGCTGTA GCCGGTCTCG 
20941 GCTCACGTCG ACGCGTCGCG CGCCCGGCGG CGGCCACGCG 

45 21001 GCTTCCGGCC CGGCCGAGGG TGCCGCTGGC GTGCCGGGTC 
21061 ACGCGCGTGG ACGGTCACTC GCCGCCGTCC GGCCTCATCG 
21121 CACATCCACC GCGCCGGTCA CCGGCACCAC GAGCGGGGTC 
21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GATGACCAGC 
21241 CAGCAGAACC GTGCCCCGCA CCGCGTGATC AGCCAGCCAG

21301 CCGGCCAGTG AGAACAACAC CACCACCGTC GTCGGCGGGC
21361 CATCGGATGC GCCGCCCCGG TCAGCCCGGC CGCGGACAGA
21421 CAGCCAGTAC CGCCTGTGCT CGAACGCGTA GGTGGGCAGA
21481 CGGTTCGACC ACCGTGTCCC AGTCCACTGC CGTGCCCAGG
21541 CGTCAGCCAC CGCTCCCAGC CGCCGTCACC GGTCCGCAAC

55 21601 TTCCATCGCC GGCAGCAGCA CCGGATGGGC GCTGCACTCC 
21661 CTCCGCCACC GCCGCGTCCA GCGCGACGGG GCGACGCAGG 
21721 ATCCACCGGC TCGGTCACCC AGGCGCTGTC CACCGTGGAC 
21781 CCCGCCGGAA ATCCCCTCCA GTACCTCGGC CAACTCGTCC 
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC

60 21901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC 
21961 ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA 
22021 AGCCATCGCC CCCCGCCCGG CCAGCCGCCC GGCGATCACC 
22081 GCGGGCGGCG TCCTCAAGGC TGAGGGCTCC GGCCACACAC 
22141 GGAGTGTCCG ACCACCGCGT CCGGCACGAC CCCATGCGCC
GCGTCCGCGA ACGCGCGGCA
ACGAACCCGG TCGGGGTCGC 
CCGTGGCGCA GTGCGTGCCC 
TCCACCGTGA ACGCCGGTCC 
CTGGGCTGCA TGCCGATCGA 
GAGAAGGCGC CCATGAACAC 
GCGCTCTCGA ACGCCTCCCA 
CGTGCCTCAC GGGGGCTGAT 
AAGCCGCCGC GGTCCGTGTC 
GCCGGGAAGC CGGTGACCGC 
GAGGTGACGC CGCCCGGCAG 
GTCGCGGCGG CTGTGGGAAC 
CGCGACGCGA TGGCCOJCGG 
CCGGTCGCCG CGGCGAGTCG 
TCCTTGAAGG CCGCGTCCGC 
GCGTTGTCGC GGACCAGTGC 
AGCCGGTCGG CGAGCGGAAC 
TCGGCGAAAA GCGGCGATGT 
GTGCCGGTTC CGGCCGCGGC 
AAACCGCCGC GGCGGACACG 
TCATCGGCCC AGAGGCCCCA 
GTCGCGAGTC CGTCGAGGAA 
ATGATGCCCG CGACGGACGA 
TCGTGCAGGT GCCAGGCGCC
GTGAGTGCCG TGGTCACGCC 
GGCCTGCCGG CGGCGGCGAG 
ATGTGGACAC CGGGAGTGTC 
GCGCCATGCT CGGCGACGAG 
GTGATGACCA CCGTGCCGTC 
CGGGTGAACC GCGGCGCTTC
GTCGTGGCGG CGGCCAGCCC 
CGGCCCGGGT GCTCGGCCTG 
CCCGCGTCGA TCCGGACGAC 
TGCAGCTCGC CGAGCACGAA 
GGTTCCGGGA GCGCGGAGAC 
AGCTCGGTCC AGGAGAGGCC 
TTCACCGGTC GCGCGGTCAG 
GCATGCACGG CAGCGCCGTC 
GTCGTGTGGA ACCGCACGCC 
GCGAGCAGCG GCAGGCAGGT 
TAGTGCGGCG TGTCGTCGGC 
CCGTCGCGCC AGGCGGTGCG 
GCCAGCCGCT CGTAGAACGC 
GGCGGCGGGA CCGCCGCGAC 
CAGCTGTCCG TGCCCTCGGT 
GCCCCTTCGA CGGTCACCGA 
TCGATGACCA GTTCATCCAC 
TCCACAAACG CCGTACCCGG 
GGATGCGTAC GCAACGAGAT 
AGTGCTGTGA CGGCGGCCAG 
TCGGTGGCAC CGGCCGCCTC 
TCGAGCAGCC GTCCCGGCAC 
GTCCACGCCT GCGCCAACGC 
GACGCCACCG TGTGAGCCTG 
ACGAACACGG ACCCGTCCAG 
TTCCGGTACC AGTAGCCCTC 
CACCAGGCCA CCGACCCGGT 
TCGATGGCTT CCACGTGGGG 
ACGCCTTCGG CCTCGTACCG 
ACAGTCGAAG ACGGGCCGTT 
CCGGCCGGCA ACGCCACCGA 
TGGCTGCGCA AGGCCACCAC 
GCCGCCGCGA TCTCGCCCTG 
TGCCACAGCG CGGCCAGGCT
22201 CACCGCGACC GCCCAGCTGG

22261 CGCCAACATC TCCCGCACAT

22321 CATACGAGCC GCGAACACCG 

22381 AGCACCCTGC CCGGGAAAGA

22441 CCGGGCATCG CCCAACAACA

22501 CTGCGCGACC GCGGCCACAT

22561 CTGCCCCCGC AGACTCACCT

22621 AGCCGACTCC CCACGCGACG

22681 GCTCACCCCG AAAGCGGAGA

22741 CGCCTCGGTG AGCAGTTCCA

22801 CACATGCAGC GTCTTCGGCG

22861 GGCGACACCC GCAGCCGCCT

22921 CGGAACCTCA CGCTCCTGCC

22981 CAGCGTCGTC CCCGTCCCGT

23041 CTTGTGGAGG GCCTGGCGGA

23101 GTTGGAGGCG CCGTCCTGGT 

23161 GTTGCGCTCG GCGTCGGAGA 

23221 GGTGCCGTCC GCCGCGTCAG 

23281 CCGGGAGAAC TCCACGAAGG

23341 CAGCGAGCAC TCCCCGGTCC

23401 CGAACACGCC GTGTCGACCG

23461 TCCGGCGAGC ACCGCGGGCT

23521 GCCGTAGCCG TAGTAGAAGC

23581 CGGCACGATG CCGGCGTGTT

23641 CGGGTCGAGT GCGGTGGCCT

23701 GGCGCCCGCG AGTGCGCCGG

23761 CACGTCCCAG CCGCGGTCGG

23821 CTGCCACAGC TCTTCCGGTG

23881 GGCGAGCGGC TCGTTCGCCG

23941 GTCCTTGACC GACGTCCGCA

24001 TCAGCACGTG CGCGATGAGC

24061 CCGCGGTCGT GGTGCTCGCG

24121 TGTCGTCCGG GGTCCCGTTG

24181 CGCCGGCGGC GGGATAGTCG

24241 AGACCCGGTT GCGCAGGCCG

24301 TGGTGGCCGT GACCGCCGCC

24361 CGACGCCGAG CAGCACCTGT

24421 GGGAGCCGCC GTCGGTCGCG

24481 ACGGGTCGCC GGGCCCGGGT

24541 CGGCGTCGAG GAGGTCGGTC

24601 CTTGTGCCCG GCGCAGGTCG

24661 CGGCGAGAAC GAACGCGGTC

24721 ACTCGGCGGT GCCGTCCGCG

24781 GCTCGTACCG GATCACTTCG

24841 CGCCCGCGAG GAGGACGGTG

24901 CGAGGCGGGG CGCTTCGAGG

24961 AGAGGGCGGC GGCGCGGCGG

25021 CCGGTTCCGC GGTGTCGAGC

25081 ACACCACCAG CGTGGCGCCG

25141 GACCGGATAC CGGGACGACG

25201 GGCGGGCCGT GGTGCCGGGT

25261 GCCGCACGTC CCCGTCCGGG

25321 GAGCCACCGG CCGTCCCAGT

25381 CGTGGACGAA GGTGACGCGC

25441 ACGCGAACGG CAACCGTACC

25501 GCGCGGTGAC GAGCAGCGCC

25561 CGTCGAGGGC GACTTCGGCG

25621 GGAACTCGGG GCCGAACTCG

25681 CGACCGGTTC CGCGTGCTCG

25741 CGATGCCGGC GAAGCCGGAG

25801 GGACGCGCAC GGCACGGCGT

25861 CGGCGCCGGT GGCGGGCAGG

25921 AGCCTGCCTC GTCGGCGCCG

25981 CGGCGCCGTC GACGGAGTGA
CCGGCTGGAC CACCTCCACC CGCTCCGCCA CATCCGGCCG
CCCAGCCCGT GTGCGGCAAC AACGCCCGCG CACACTCCTC 
CAGAACACGC CATCAACTCC ACACCCATGC CCACCCACTG 
CGAACACCGT ACGCGGCTGA TCCACCGCCA CACCCATCAC 
CCGCACGGTG ACCGAAGACA GCACGCTCAC GCACCAACCC 
CCACACCACC CCCGCGCAGA TACCCCTCCA GCCGCTCCAC 
CACTCCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC 
GCCCGGGAAC ACCCTCAAGG ATCACGTGCG CGTTCGTACC
CACCGGCCCG GCGCGGACGT CCCGCGTCGG GCCACGCCCG 
CCGCGCCCTC GGTCCAGTCC ACATGCGACG ACGGCTCGTC 
CGATGCCATA CCGCATCGCC ATGACCATCT TGATGACACC 
GCGCATGACC GATGTTCGAC TTCAACGAAC CCAGCAGCAG 
CGTACGTCGC CAGAATCGCG TGCGCCTCGA TGGGATCGCC 
GCGCCTCCAC CACGTCCACG TCGGCGGGGG CGAGCC^'CGC 
TGACGCGCTG CTGGGAGGGG CCGTTGGGTG CGGAGATGCC 
TGACGGCGGA GGAGCGGACG ACCGCGAGGA CGGTGTGTCC 
GCTTTTCGAC GACGAGGACG CCGGCCCCCT CGGCGAAACC 
CGAACGCCTT GCACCGTCCG TCCGGCGCGA CGCCGCCCTG 
TCTGTGGTGA TGCCATCACT GTGACACCAC CGACCAGCGC 
GCAGCGCCTG CCCGGCCTGG TGCAGCGCGA CCAGCGACGA 
TGACCGCCGG ACCCTCCATG CCGAAGAAGT ACGACAGCCG 
GTGTGCTGTA GGCGCCGAAT CCGCCCAGGT CCGCGCCCGT 
CGCCGACGAA GACGCCGGTG TCGCTGCCGC GCAGGGTGTC 
CGAGCGCCTC CCAGGCGATT TCGAGGAGGA TCCGCTGCTG 
CGCGCGGACT GATGCCGAAG AACGCGGCAT CGAAGTCGGC 
CCCGCCCGGT GGCGGACTCG GCGGCGGCGT GCAGCGCGGC 
TGGGGAAGTC GCCGATCGCG TCGCGGCCGT CCGCGACGAG 
AGGTGACGCC GCCCGGCAGT CGGCAGGCCA TGCCGACGAC 
CGGCGCGCAG CGCGGTGTTC TCCCGGCGGA GCTGCGCGTT 
GCGCCTCGAT CAGGTCGTTC TCGGCCATCG CCTCATCCCT 
GCGTCTGCGT CCATGTCGTC GAACAGTTCG TCGTCCGGCT 
GGTGCCTGTG CCGGTGGTTC ACCGCCGTCC GGGGTCCCGT 
ACGTCCGGGG CCAGGAGGGT CAGCAGATGA CGGGTGnGCG 
AAGACGAGCG TGGCCGGCAG CGGAATGCCG AGGGCCTCGG 
AGCGCGGTGA GCGAGTCGAC CCCGAGGTCC TTGAACGCCG 
GCGTCGGTGT GGCCCAGCAG GGTGGCGGCG GTGTCGCGGA 
TCCCGTTCCT TGTGGGGCAG GTCCGGCAGG CGTTCCAGCA 
GAGCGCCGGG TGGGGCGCTG GATCGGTCGC CACAGCGGTG 
GGGGCGGTCG CCACGACCAC GGCTTCCCCG GTGGCGCACG 
AGCCGGTCCG CCGCGGCGGT GAACGCCACG GCCGGCAGGC 
GCCAGGGCCT GGAGCGGTCC GGCCGCCTCG CCGGACGGAA 
AGGTCGAGGT CGCGGGTCAG GCGGTGCAGT TCCCAGGCCG 
TGGACGACCG CGGTCACCGG GGTTTCCGGC ACTGTGCCCG 
GCGCCGTGTC CGCCGAGGTG TCCGGCGAGT TCCTCCGAAC 
TCGCCGTACG AGGCCGCGGC CGTGGTGGGC GCGGCGGGGA 
CGCCCGTCGG CCAGGCGCAG GTGCGGTTCG TCGAGGCGGG 
GGGGTGACCG TGTCGGTGGT CTCCACGAGC ACGAGCCGGC 
AGTGCGGCGA CGGCACCGGC GACGGGCCCG GCCTCGGCGG 
GCGGTCCTCG GGTCGTCCAG TGCGGTACGG ACCTCGTCGG 
ATGACGTCGG GCGTGGCGTC GTCGCCGAGG TCGGTGTACC 
GCCGCCGGGG CCCGGACGCC GGTCCAGGTG CGCCGGAACA 
CCCGTCGTGG CGGGGGGCCG GGTGATGAGC GAGCCCiTCT 
TCGTCGGCGA GGTGCACGCG GGCGCCGCCC TCGCCCTCGC 
AGTTTCGTGG CGCCGCTGGT GTGGACACGG ACGCCGGTGA 
CCCGCGTTCT CGGCGGCCGC GCCGATGCTG CCCGCTTGCA 
GGGTGCAGTG TGTAGCGGGC GGCGTCCCTG GCGAGGGCGC 
CAGACGGTGT CTCCGTGGCT CCACGCGGCG GACATGCCGC 
TATCCCGCGT CGTCGAGTCG CTGGTAGAAG GCCGCGACGT 
GGCGGCCAGG GCCCCGGCGT GGTGGCCGGT TCGGTGGTGG 
GCGTGGCGGG TCCATGTCCG GTCGCCGTCC GTCCGGGCGT 
CCGGTGTCGT CGGGCGCGGC GACGGTCACG CGCACCTGGA 
ACCAGCGGTG TCTCGACGAC CAGTTCGTCG AGCAGGTCGC 
CGTCCGGCCA ATTCCAGGAA GGCGGGTCCG GGCAGCAGTA 
CCGGCCAGCC ATGGGTGGGT GGCCAGCGAG AACCGGCCGG
26041 TGAGCAGCAC CTCGTCGGAG
26101 CGGCGTCGAG TCCGAGGCCG
26161 CATGGTGGAA GGCGTATGTG
26221 CCCAGTCGAC GGGCACGCCG
26281 CTCCCCCGCC GCGGCGGAGC
26341 GGTGCGCGCT GACCTCGACG
26401 TGGCGAAGCC TACGGGGTGG
26461 CGATCCAGCG TTCGTCGGCG
26521 CCGCGACGAT CCGCTGGAGT

10 2 6581 AGTCGACGGC GATGCGGCGC 
2 664 1 CGACGGCGTC CGGGCGCCCG 
2 6701 AGACGCCGTC CATCCGGGCG 
2 6761 CCATCGCGCC GCGTCCGGCG 
2 6821 GGCGGGCACC GTCCTCCAGG 

15 2 6881 GGGAGTGTCC GATGACGGCG 
2 6941 ACACCATGAC GGCCCAGCAG 
27 001 CGTCGAGCAT GGCGATGGGG 
27 061 GCATCCTGGC GGCGAACACC 
2 7 121 GCGGTCCTTG TCCGGGGAAG 

20 27181 CGACGTCGTC GTCGAGCAGC 
27 241 CCGCGGCGAT GGCGCGCGGG 
2 7 301 GGACCTGGCC GTCGAGGGCC 
27 361 TGGCGATCAG CGGCTCACCG 
27421 CCGGGTGGGC TTCCAGCAGG 

25 274 81 CGGCGCGCCG CGGGCGGTCG 
27 541 CGCCGGCCGT CCAGTCGACG 
2 7 601 TGCCGTGCCG CATGGCGAGG 
27 661 TGTGGCCGAT GTTGGACTTC 
2 7 721 AGGTGGCCAG CACCGCCTGT 

30 2 7 781 CCTCCACGGC GTCCACGTCC 
2784 1 CCCGCTCCTG CGAGGGCCCC 
2 7 901 CCGCCGAACC CCGGACAACC 

27 961 TCTCGACGAT CAGCACACCG 
28021 ACGCCTTGCA GCGCGCGTCG 

35 28081 ACGCCGAGGC CATCACCGTG 
2814 1 GTGACTGCCC GGCCTGGTGC 
28201 CCGCCGGACC CTCCAGACCG 
282 61 TGCCGGTCGC GCCGAAACCG 

28321 CCATGAACAC GCCGGTGTCG
28381 GCGCCTCCCA CGAGGTCTCC

28441 GCGGACTGAT CCCGAAGAAC
28501 GACGCACGGT CGACGTGCCC
28561 AACCACGGTC CGTCGGAAAC
28621 AGTCCTCCGG CGACGCGACC

45 28 681 GCTCGTCCTG CCGGACGGCC 
2 8741 GCGCCGCGGT GAGCTTCGCC 
28801 CGGGCAGCCG TACGCCCGTC 
288 61 AGTCGACGCC GAGTTCCTTG 
28 921 CGAGTACGGC CGCGGTGCAC 

50 28 981 CGGAGAGCCG CGCGATCCGG 
2904 1 CCCGGCGCGG TGCGCGCAGC 
2 9101 GCGCCGGGTC CGAGGACCGC 
29161 GCGCCGTCAC GCCGTCGCGG 
2 9221 GTTCCCACAG GCCCCAGGCC 

55 2 9281 CCAGCGCGTC GAGGAACGCG 
29341 CACCGGCGGC CGACGAGTAG 
2 9401 GCAGGTGCCA CGCGGCGTCC 
2 9461 GCGCGGTGAG GACGCCGTCG 
2 9521 GCGCCGGGTC GATCCCCGCC 

60 2 9581 CGATCGCCGT GACCTCGGCG 
2964 1 GCAGCCGGCG CACGCCGTGG 
2 9701 CGGAGCCACC GGTGACGAGC 
2 9761 GGACCGCCGG GGCCAGACGG 
29821 CATCGAGCGC GGTGGCCGCT
TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGGTGGTCGA
GAAGCGTCCG TGCCGGCCGC GGTCTCGATC CAGTAGCGCT 
GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG ACGACCGCCG 
GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC AGCCGGTGGA 
GTGGCGACGG TCGCGCCGTC GATCGCGGGC AGCAGCACGG 
AACACGGTGT CACCCGGCTC GCGGGCAGCG GTCACGGCCG 
CGCATGTTGC GGAACCAGTA CTCGTCGTCG AGCGGCGCGT 
GTGGAGAACC ACGGGATCTC GGGCGTGCGC GAGGTGGTG7 
TCGTCGTACA GCGGGTCGAC GAACGGGGTG TGGGTCGGGC 
ACCCAGACGC CGCGGGCCTC GTAGTCGGCG ATCAGCGTTT 
GCGACGGTCG TGGTGGTGGC GCCGTTGCGG CCCGCGACCC 
GCATCCGCCT CGACGTCGGC GGCCGGGAGC GCGACCGAGC 
AGTTCGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA 
GTGAGCGCTC CGGCGACACA GGCCGCGGCG ATCTCGJCCT 
TCCGGGCGTA CGCCCGCGGC CTCCCACACG GCGGCCAGCG 
ACGGGGTGCA CGACGTCGAC GCGGCGGGTC ACCTCCGGGT 
TCCCAGCCCG TGTGCGGGAT CAGCGCGTCG GCGCATTGGC 
GGGGAGGCCG CCATCAGTTC GACGCCCATG CCGCGCCACT 
ACGAAGACGG TGCGCGGCTC GGTGAGCGCC GTGCCGGTGA 
ACGGCGCGGT GCGGGAACGT CGTACGCCTG GCGAGCAGGC 
TCGTGGCCGG GACGGGCGGC GAGGTGCTCG CGGAGTCGGC 
GTGGCGGTCC GCGCCGAGAC GGGCAGTGGT GTGAGCGGCG 
GGCTTCGAGG CCGACGGCTC CTCGGCCGGC GGCTCCCCGG 
ACGTGGGCGT TGGTGCCGCT GACGCCGAAG GAGGACACAC 
GTCTCGGGCC AGGGCCGGGC ATCGGTGAGG AGTTCGACGG 
TGCGAGGACG GCGTGTCCAC GTGCAGGGTG CGCGGCAGGG 
ACCATCTTGA TGACACCGGC GACACCCGCG GCGGCCTGAG 
AGCGAGCCCA GCJkGCACCGG GGTGTCGCGC CCCTGCCCGT 
GCCTCGATGG GATCGCCCAG CCTGGTGCCG GTGCCGTGCG 
GCCGGGGTGA GCCCGGCGTT GGCCAGGGCC TGCCGGATCA 
TTCGGCGCCG ACAACCCGTT GGAAGCACCG TCCTGGTTGA 
GCCAGCACAC GGTGGCCGTT GCGCTCGGCA TCGGAGAGCC 
GACCCCTCGG CGAAACCGGT GCCGTCAGCC GCATCCJCGA 
GGCGCGAGAC CCCGCTGCTG GGAGAACTCG ACGAAGCCGG 
ACGCCGCCGA CCAGGGCGAG CGAGCATTCG CCGGAGCGCA 
AGCGCCACCA GCGACGACGA ACACGCCGTG TCGACCGTGA 
TAGAAGTACG ACAGCCGACC GGACAGCACA CTGGTCTGGG 
CCCAGGTCGG TGCCGAGTCC GTACCCGTCG GAGAAGGCGC 
CTTCCGCGCA GCGACTCCGG GAGGATCCCG GCGTGTTCCA 
AGGACCAGAC GCTGCTGCGG GTCCATCGCC AGCGCCTCAC 
GCCGCGTCGA AGTCCGCCAC CCCGGCGAGG AAGCCACCAT 
GGATGATCCG GATCGGGATC GTACAGCCCG TCCACGTCCC 
GCCGTGATCC CGTCACCACC CGACTCCAGC AGCCGCCACA 
£CACCCGGCA GCCGGCAGGC CATCCCCACG ATCGCCAACG 
GCGGTCGTGG TGCGGGTCGG CGATGCCGTC CGGCCGGACA 
GCGACGGCGC GCGGCGTCGG GAAGTCGAAG ACCGCGGTGG 
CCCTCGGTGA AGGCGTTGCG CAGCCGGATC GCCATGAGCG 
AACGTGGCGG TCGCCTCGAC CCGTGCGGCA CCGTCGTGGC 
TGCCGGACGA CGGCGAGCAC GTCCTTTTCG GCGTCCGCGG 
TCGGCGAGGG TGGTGGCGCC GGCCGCCCGG CGCCGCGGCT 
AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGACCA 
AACGCCGCGT CGAACAGCGT CAGTCCGCCT TCGGCG JTCA 
CGCATGCGGG CGCCGGTGCC GACCGTCAGC CCGCTCTCCG 
ACGGACAACG CGGGCAGTCC GGCTGCCCGG CGCTGTTCGG 
TTCGCGGCCG CGTAGTTGCC CTGTCCGGGG CTGCCGAGCA 
AGGACGAACG CGGCCAGTTC CGTGTCCTGG GTGAGTTCGT 
ACCTTCGGGC GCAGCACCGT CTCGAGCCGG TCGGGGGTGA 
TCGAGGACGG CCGCGGTGTG CACGACGGCC GTGAGCGGGT 
AGTACGGAGG CGAGTTCGTC CCGGTCGGCG ACGTCGCAGG 
CCGGGCACGT CGCTCGCCGT GCCGCTGCGC GACAGCATCA 
CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGCGTCC 
ACGGTGCCGT CCGGGTCGAG CGCCGGAGCG TCACCCGCCG 
CGGGCGTACA CC7GGCCGTC ACGCAGCACC ACCTGGGGCT 
GCGAGCAGCG GCTCGGCGGT GTCCGGGGCG GCGTCGACGA
29881 GGACGATCCG GCCGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GCCGCGGCCG

29941 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA

30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 

30061 CACCGCCGCC GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 

30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA 

30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA 

30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG
30301 CCGTGGCGCC GGTGGCGTGG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT

30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTG CAGGCCGTAC CGTCCGGCGT

30421 CGGCGAGCTG TCCGTCGGCG AGGGCCACTT CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA

30481 CGGCGCGCGG GCGGGGCAGC GCGGGCCCGT CCGTGTACCC GGCTCGGGCC AGACGGTCGG

30541 CGATGTCGTC GGGGTCCACC GGCCGGGCCG TGGCGGGCGG CCACGTCGAC GGCATCTCCC
30601 GCACGGCCGG GGCCGTCCGC GGG1CGGGGG CGAGGATTCC GTGCGCGTGC TCGGTCCACT 
30661 CCCCCGCCGC GTGCCGCGTG TGCACGGTGA CCGCGCGGCG GCCGTCCGCC CCGGGCGCCC 

30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG ACCGCGGCAG CGTGAGGGGG GTGTCCmCGG

30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT CGTCGCCCGC CCGGATCGCC AGATCCAGGA

30841 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT 

30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG TGCCGGGCAG GGTGACCGCC GCGGTCAGCG 

30961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG GGGCCGCGTC GCCCGCGGTC TGGGTGCCGA 

20 31021 GCCAGTAGCG GACCCGCTCG AACGGGTACG TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 

31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA CGCCGTCGGT GTGCAGCCGG GCGAGCGCGG 

31141 TCAGGGCGGA TCGCGG7TCG TCGTCGGCGT GCAGCATCGG GATGCCGTCG ACGAGTCGGG 

31201 TCAGGCTCCG GTCCGGGCCG ATCTCCAGGA GCACCGCCCC GTCGTGCGCG GCGACCTGTT 

31261 CCCCGAACCG GACGGTGTGG CGGACCTGTC GTACCCAGTA CTCCGGCGTG GTGCAGGCGG 

31321 CGCCCGCGGC CATCGGGATC CTCGGCTCGT GGTACGTCAG GCTCTCCGCG ACCTTGCGGA

31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG AGTGGAACGC GTGGCTGGTC CGCAGGCGGG

31441 TGAAGCGGCC GAGCCGGGCC GCGACGTCGA GCACCGCCTC CTCGTCACCG GAGAGCACGA

31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT CCACGCCGTC CCGCAGCAGC GGCAGCGCGT

31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG CCCCGCCGGA CGGCAGCGCC TGCATCAGGC

31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT CCTCCAGGGA CCAGACGCCG GCGACGTACG

31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA CGAAGGCGTC CGGGCGTACG CCCCACGCCT

31741 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG CGAACACCGC GGGCTGGGCG TACCCGGTGT

31801 CGTGGAGGTC GAGCCCGGCG GGCACGTCGA GGGCGTCCAG CACCTCGCGG CGAGTGCGCG

31861 CGAAGACGTC GTAGGCGGCG GCCAGTCCGT CGCCCATGCC GGGACGTTGT GAGCCCi'GTC

31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG GTTCTGCGGC GCCGGTGACC GTGTCGGTGC

31981 CGATCAGCGC GGCCCGGTGC GGGAAGGCCG TGCGGGCGAG CAGGGCCGCG GCCACCGCGC 

32041 GCTCGTCCTC CTCGCCGGTG GCGAGGTGGG CGCGCAGGCG GTGTACCTGT GCGTCGAGTG 

32101 CCTGCGGGGT GCGTGCCGAG AGCAGCAGGG GCAGCGGTCC GGTGTCGGGT GCCGGGGCGG 

32161 GTTCGGGGGC CGGTCGGGGG TGGCTTTCGA GGATGATGTG AGCGTTGGTG CCGCTAACGC 

32221 CGAAGGAGGA CACCCCGGCG CGCCGTGGGC GGTCGGTTTC GGGCCAGGGG CGGGCGTCGG

32281 TGAGGAGTTC GACGGCGCCG GCCGTCCAGT CGACGTGCGA GGACGGCGTG TCCACGTGCA

32341 GGGTGCGCGG CAGGGTGCCG TGCCGCATGG CGAGGACCAT CTTGATGACA CCGGCGACGC

32401 CCGCGGCGGC CTGAGTGTGG CCGATGTTGG ACTTCAGCGA GCCCAGCAGC ACCGGGGTGT

32461 CGCGATGCTG CCCGTAGGTG GCCAGTACCG CCTGCGCCTC GATGGGGTCG CCCAGCCTGG

32521 TCCCGGTGCC ATGCGCCTCG ACAGCGTCCA CATCCGCCGG GGTGAGCCCG GCGTTGGCCA

32581 GCGCCTGCCG GATCACCCGC TCCTGCGACG GCCCGTTCGG CGCCGACAAC CCGTTGGAAG

32641 CACCGTCCTG GTTGACCGCC GAACCACGCA CGACCGCCAG GACATTGTGG CCGTGCCGCT

32701 CGGCGTCGGA GAGCCTCTCG ACGATCAGCA CACCGGATCC CTCGGCGAAA CCGGTGCCAT

32761 CAGCCGCATC CGCGAACGCC TTGCAGCGGC CGTCCGGGGA GAGGCCCCGC TGCTGGGAGA

32821 AGTCCACGAA GCCGGACGGC GAGGCCATCA CCGTGACGCC GCCGACCACG GCGAGCGAGC

32881 ACTCCCCCGA GCGCAGCGAC TGCCCGGCCT GGTGCAGCGC CACCAGCGAC GACGAACACG

32941 CCGTGTCCAC CGTGACCGCC GGACCCTCCA AACCGTAGAA GTACGACAGC CGACCGGACA

33001 GCACACTGGT CTGGGTGCTG GTGGCACCGA AACCGCCGCG GTCGGCTCCA GTGCCGi'ACC

33061 CGTAGAAGTA GCCGCCCATG AACACGCCGG TGTCGCTTCC GCGCAGCGAC TCCGGGAGGA

33121 TCCCGGCGTG TTCCAGCGCC TCCCACGAGG TCTCCAGGAC CAGACGCTGC TGCGGGTCCA

33181 TCGCCAGCGC CTCACGCGGA CTGATCCCGA AGAACGCCGC GTCGAAGTCC GCCACCCCGG

33241 CGAGGAAGCC ACCATGACGC ACGGTCGACG TGCCCGGATG ATCCGGATCG GGATCGTACA

33301 GCCCGTCCAC GTCCCAACCA CGGTCCGTCG GAAACGCCGT GATCCCGTCA CCACCCGACT

33361 CCAGCAGCCG CCACAAGTCC TCCGGCGACG CGACCCCACC CGGCAGCCGG CAGGCCATCC

33421 CCACGATCGC CAACGGCTCG TCCTGCCGGA CGGCCGCGGT CGGGGTACGC CGCCGGGTGG

33481 TGGCCCGCGC GCCGGCCAGT TCGTCCAGGT GGGCGGCGAG CGCCTGCGCC GTGGGGTGGT

33541 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT

33601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT CCCTGAACGC GCGCGCGGGT GCGATGGCGT

33661 GGGCGTCGCG GTGGCCGAGC ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC
33721 GCGCGGCCCG AGGTGCGGAC GTGCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG
33781 GGACCCGGTC GGACGCGGCG ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC
33841 GGTCGGTGTG CAGGGCCGCG TCGAACAGGG CGAGCCCCTG TGCGGCCGTC ATCGGGGTCA
33901 TGCCGTTGCG GGCGATGCGG GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG
33961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGGCAG CCCCTGGTGG TGCCGGTGGC
34021 GGGCGAGCGC GTCGAGGAAC GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA
34081 ACGTGGCGGA TATGGACGAG TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT
34141 CGTGCAGGTG CCAGGCGACG TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC
34201 GCATGGTCGT CACGGCCGCG TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC
34261 GCTGGGCGAC GTCGGCGACG ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGCCA
34321 CGTACCGCAC GCGGTCGTCC TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA
34381 CGACCTCGGC GGCCTCGTGC ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC
34441 CGGTGCCGCC GGTGACGAGG ACGGTCCCGC CGGTCAGCGG GGAGGTTCCG GTGGCCGCGG
34501 CGACACGGCG CAGACGGGCC GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT
34561 CGCCGGCGGC GAGCCCGGCC GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGCG
34621 CGACGCGGCC GGGATGCTCC GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCC i'GCG
34681 CGGGATCGCC GGTACGGGTG GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG CACGGTGGTG AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG GTCGCCGGGT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTACGTC CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCGGCA TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCCGTGCTCG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
35041 CCAGCAGCAC GCGCAGCGCG GTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA
35101 ACGCCAGCCG GCGCCGCTCC GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT
35161 CGAGCAGCAC GGGGTGCAGC CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT
35281 ACGAGAGCGG CAGCGCGTCG TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG
35341 GCCAGTCCAC GGGCTCCGCC GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA
35401 GCGCCCAGGG GCCCGTGCCG GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAC GGTGGCCTGG ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG CTCCGCGATC TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT
35581 CCACGAGCGC CGAGCCGGGC ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG
35641 CCTGACGGCG TACCGAGACA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA
35701 CCCACGAGCC GAGCAGCGGG TGGCCGGACG TTCCCGCCGG TTCCGCGTCG ATCCAGTACC
35761 GGTCACGGCG GAACGGGTAC GTGGGCAGCG GCACCACCCG ACGCGTCGCG AACGACCAGG
35821 TGACGGGCAC GCCCCGGACC CAGAGCGCGG CGAGCGACCG AGTGAAGCGG TCCAGGCCGC
35881 CCTCGCCTCG CCGCAGTGTG CCGGTGACGA CCGTATGCGC ATGCCCGGCG AGCGTGTCCT
35941 CCAGTGCGGT GGTGAGCACG GGATGCGCGC TGACCTCGAC GAACGCGCGG TATCCGCGGT
36001 CCGCCAGGTG CCCGGTCGCG GCGGCGAACC GAACGGTGCG GCGCAGGTTG TCGTACCAGT
36061 AGGCGGCGTC CGCGGGCCGG TCCAGCCACG CCTCGTCCAC GGTGGAGAAG AACGGGACGT
36121 CCGGCGTGCG CGGAGTGATG CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA
36181 CATGCGCGGT GTGCGACGCG TAGTCGACGG CGATCCGGCG GGCGCGGGGG GTGGCGGCCA
36241 GCAGCTCCTC CACGGCGTCG GCCGCACCGG CGACAACGAT CGACGCGGGT CCGTTGACCG
36301 CGGCGACCTC CAGGCGCCCG GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCACCGAGA
36361 CCATGCCGCC CTGCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACCT
36421 TCGCGGCGTC GTCCAGGGTG AGCACCCCGG CGACGCAGGC CGCGGCGACT TCGCCCTGGG
36481 AGTGGCCGAC GACCGCGGCC GGGGCGACCC CGTGCGCACG CCACAGCTCC GCCAGCGCCA
36541 CCATCACCGC GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGCTCCGG
36601 GCCGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGhGC GCCGTGGCGC
36661 ACTCGCGGAG CCGCCGGGCG AACACGGGCT CGGTGGCGAG CAGTTCGGCA CCCATGCCGG
36721 CCCACTGGGA GCCCTGCCCG GGGAACGCGA ACACGACACG TGTGTCGGTG ACGTCGGCGG
36781 TTCCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TCTCGGGCCG
36841 GCACGACCGC CCGGTGGCGC ATGGCCGTCC GGGTGGTGGC GAGCGAGTGG CCGACCGCOG
36901 CCGCGGCGCC AGTGAGCGGG GCCAGCTGTC CCGCGACGTC CCGCAGTCCC TCCGGGoTCC
36961 GGGCCGACAT CGGCCAGACC ACGTCCTCGG GCACCGGCTC GGCTTCGGGT GCGGACACGG
37021 GTGCGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC
37081 CGAACGACGA GACACCCGCA CGCCGGGCGC GCCCGGTGAC CGGCCACGGC TCACTGCGGT
37141 GCAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGCTCGTCG ACGTGCAGCG
37201 TGCGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG
37261 CCGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC
37321 GTTCGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC
37381 CGGTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG
37441 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC
37501 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCCAGCAC GGGGTGGCCG TGGCGGGTGG
37561 CGTCGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG
37621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT
37681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT
37741 CCCCCGAGCG CAGCGACCGC GCGGCCTGGT GCAGCGCCAC CAGCGACGAC GAACACGCCG
37801 TGTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA
37861 CGCTGGTCGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCCT
37921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC
37981 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG
38041 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA
38101 GGAAGCCGCC GTGACGCACG GAAACCTTGC CGACCGCGTC GGGGTTCGGG TCGTAGAGCG
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGAT CGCGTCCCCG CCGGAGTCGA
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA
38281 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG
38341 CAGGGGCCGG CTCACCCCGC CGTTCCTCAT CCAGGCGGGC GGCGAGCGCG GCCGGTGTCG
38401 GGTGGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCGT CGTCTCGGCG AGGCTGTTGC
38461 GCAACCGGAC ACCGCTGAGC GAGTCGATGC CGAGGTCCTT GAACGCCGTC GTGGGCGTGA
38521 TCTCGGAGGC GTCGGCGTGG CCGAGCACGG CGGCCGTGGC CGCACACACG ATGGCCAGCA
38581 GGTCACGATC GCGGTCGCGG TCGCGGTCGC GGTTGTCCTC CGCACGGGCG GCGATGCGGC
38641 GCTCGGTCCG CTGCCGGACG GGCTCGGTGG GAATCGCCGC GACCATGAAC GGCACGTCCG
38701 CGGCGAGGCT CGCGTCGATG AAGTGGGTGC CCTCGGCCTC GGTGAGCGGC CGGAACCCGT
38761 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGGTGTGCC
38821 ACATGCCCCA GGCGATGGAG GTGGCGGGTT GGCCGAGGGT GTGGCGGTGG GTGGCGAGGG
38881 CGTCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG
38941 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGTT TTGGGTGAGG TGGTGCAGGT
39001 GCCAGGCGGC GTTGGCTTTG GGGTGGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCGT
39061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GGTTGGGGGA
39121 TGTGGGCGAG GGTGGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC
39181 CGGGGGTGGT GTCGGGGGGT GGGGTGCGGG AGAGGAGGTA GGTGTGGGGG TGGTTCAGGT
39241 GGCGGGCGAG GATGCCGGCG AGGGTGCCGG AGCCGCCGGT GATGATGATG GCGTGTTCGG
39301 GGTTGAGGGG GGTGGTGGTG GGTGGGGTGG TGGTGTGGAG GGGGGTGAGG TGGGGTCGGT
39361 GGAGGGTGTG GTGGGTGAGG CGGAGGTGGG GGTGGTCGAG GGTGGCGAGT TGGGCCAGGG
39421 GGAGGGGAGT GTGGGGGTGG TCGGTTTCGA TGAGGCGGAT GCGGTGGGGG TGTTCGTTCT
39481 GGGCGGTGCG GGTGAGGCCG GTGACGGTGG CGCCGGCGGG GTCGGTGGTG GTGTGGACGA
39541 TGAGGGTGTG GTCGGTGGTG GTGAGGTGGT GTTGCAGGGC GGTCAGGACG CGGGTGGCGC
39601 GGGTGTGGGC GCGGGTGGGT ATGTCCTCGG GGTCGTCGGG GTGGGCGGCG GTGATCAGGA
39661 CGTGTCCCTC GGGCAGGTCA CCGTCGTAGA CCGCCTCGGC GACCGCGAGC CACTCCAACC
39721 GGAGCGGGTT CGGCCCCGAC GGGGTGTCGG CCCGCTCCCT CAGCACCAGC GAGTCCACCG
39781 ACACGACAGG ACGGCCATCC GGGTCGGCCP, CGCGCACGGC GACGCCGGCC TCCCCCCGGG
39841 TGAGGGCGAC GCGCACCGCG GCGGCCCCGG TGGCGTTCAG GCGCACGCCC GTCCAGGAGA
39901 ACGGCAGCTC GATCCCGCCG CCCGCGTCGA GGCGCCCGGC GTGCAGGGCC GCGTCGAGCA
39961 GTGCCGGATG CACACCGAAA CCGTCCGCCT CGGCGGCCTG CTCGTCGGGC AGCGCCACCT
40021 CGGCATACAC GGTGTCACCA TCACGCCAGG CAGCCCGCAA CCCCTGGAAC GCCGACCCGT
40081 ACTCATAACC GGCATCCCGC AGTTCGTCAT AGAACCCCGA GACGTCGACG GCCGCGGCCG
40141 TGGCCGGCGG CCACTGCGAG AACGGCTCAC CGGAAGCGTT GGAGGTATCC GGGGTGTCGG
40201 CGGTCAGGGT GCCGCTGGCG TGCCGGGTCC AGCTGCCCGT GCCCTCGGTA CGCGCGTGGA
40261 CGGTCACCGG CCGCCGTCCG GCCTCATCGG CCCCTTCCAC GGTCACCGAC ACATCCACCG
40321 CTGCGGTCAC CGGCACCACG AGCGGGGATT CGATGACCAG TTCATCCACC ACCCCGCAAC
40381 CGGTCTCGTC ACCGGCCCGG ATGACCAGCT CCACAAACGC CGTACCCGGC AGCAGAACCG
40441 TGCCCCGCAC CGCGTGATCA GCCAGCCAGG GATGCGTACG CAATGAGATC CGGCCGGTGA
40501 GAACAACACC ACCACCGTCG TCGGCGGGCA GTGCTGTGAC GGCGGCCAGC ATCGGATGCG
40561 CCGCCCCGGT CAGCCCGGCC GCGGACAGGT CGGTGGCACC GGCCGCCTCC AGCCAGTACC
40621 GCCTGTGCTC GAACGCGTAG GTGGGCAGAT CCAGCAGCCG CCCCGGCACC GGTTCGACCA
40681 CCGTGCCCCA GTCCACCCCC GCACCCAGAG TCCACGCCTG CGCCAACGCC CCCAGCCACC
40741 GCTCCCAGCC ACCGTCACCA GTCCGCAACG ACGCCACCGT GCGGGCCTGT TCCATCGCCG
40801 GCAGCAGCAC CGGATGGGCA CTGCACTCCA CGAACACCGA CCCGTCCAGC TCCGCCACCG
40861 CCGCATCCAG CGCGACAGGG CGACGCAGGT TCCGGTACCA GTACCCCTCA TCCACCGGCT
40921 CGGTCACCCA GGCGCTGTCC ACGGTCGACC ACCACGCCAC CGACCCGGTC CCGCCGGAAA
40981 TTCCCTTCAG TACCTCAGCG AGTTCGTCCT CGATGGCCTC CACGTGAGGC GTGTGGGAGG
41041 CGTAGTCGAC CGCGATACGA CGCACCCGCA CCCCATCAGC CTCATACCGC GCCACCACCT
41101 CCTCCACCGC CGACGGGTCC CCCGCCACCA CCGTCGAAGC CGGACCATTA CGCGCCGCGA
41161 TCCACACACC CTCGACCAGA CCCACCTCAC CGGCCGGCAA CGCCACCGAA GCCATCGCCC
41221 CCCGGCCGGC CAGCCGCGCC GCGATCACCC GACTGCGCAA CGGCACCACG CGGGCGGCGT
41281 CCTCCAGGCT GAGGGCTCCG GCCACACACG CCGCCGCGAT CTCCCCCTGC GAGTGTCCGA
41341 CCACAGCGTC CGGCACGACC CCATGCGCCT GCCACAGCGC GGCCAGGCTC ACCGCGACCG
41401 CCCAGCTGGC CGGCTGGACC ACCTCCACCC GCTCCGCCAC ATCCGACCGC GACAACATCT
41461 CCCGCACATC CCAGCCCGTG TGCGGCAACA ACGCCCGCGC ACACTCCTCC ATACGAGCCG
41521 CGAACACCGC GGAACGGTCC ATGAGTTCCA CGCCCATGCC CACCCACTGG GCACCCTGCC
41581 CGGGGAAGAC GAACACCGTA CGCGGCTGAT CCACCGCCAC ACCCATCACC CGGGCATCAC
41641 CCAGCAGCAC CGCACGGTGA CCGAAGACAG CACGCTCACG CACCAACCCC TGCGCGACCG
41701 CGGCCACATC CACCCCACCC CCGCGCAGAT ACCCCTCCAG CCGCTCCACC TGCCCCCGCA
41761 GACTCACCTC ACCACGAGCC GACACCGGCA ACGGCACCAA CCCATCACCA CCCGACTCCA
41821 CACGCGACGG CCCAGGAACA CCCTCCAGGA TCACGTGCGC GTTCGTACCG CTCACCCCGA
41881 ACGACGACAC ACCCGCATGC GGTGCCCGAT CCGACTCGGG CCACGGCCTC GCCTCGGTGA
41941 GCAGCTCCAC CGCACCGGCC GACCAGTCCA CATGCGACGA CGGCTCGTCC ACGTGCAGCG
42001 TCTTCGGCGC GATCCCATGC CGCATCGCCA TGACCATCTT GATGACACCG GCGACACCCG
42061 CAGGGGCCTG CGCATGACCG ATGTTCGACT TGACCGAACC GAGGTAGAGC GGCGTGTCGC
42121 GGTCCTGCCC GTAGGCCGCG AGGACGGCCT GCGCCTCGAT CGGGTCGCCC AGCCGCGTGC
42181 CGGTGCCGTG CGCCTCCACC ACGTCCACAT CGGCGGCGCG CAGTCCGGCG TTGACCAACG
42241 CCTGCCGGAT CACGCGCTGC TGGGCGACGC CGTTGGGGGC GGACAGTCCG TTGGAGGCAC
42301 CGTCCTGGTT CACCGCCGAG CCGCGGACGA CCGCGAGAAC GGTGTGCCCG TTGCGCTCGG
42361 CGTCGGAGAG CCGCTCCAGC ACGAGAACGC CGACGCCCTC GGCGAAGCCG GTCCCGTCCG
42421 CCGCGTCGGC GAACGCCTTG CACCGTCCGT CCGGGGAGAG TCCGCGCTGC CGGGAGhACT
42481 CCACGAGCTC TGCGGTGTTC GCCATGACGG TGACACCGCC GACCAGCGCC AGGGAGCACT
42541 CCCCGGCCCG CAGTGCCTGT GCCGCCTGGT GCAGGGCGAC CAGCGACGAC GAGCACGCCG
42601 TGTCGACCGT GACCGCCGGG CCCTGAAGTC CGTACACGTA CGAGAGGCGC CCGGACAGGA
42661 CGCTCGTCTG CGTCGCCGTG ACACCGAGCC CGCCCAGGTC CCGGCCGACG CCGTAGCCCT
42721 GGTTGAACGC GCCCATGAAC ACGCCGGTGT CGCTCTCCCG GAGCCTGTCC GGCACGATGC
42781 CGGCGTTCTC GAACGCCTCC CAGGAGGTCT CCAGGATCAG GCGCTGCTGG GGGTCCATCG
42841 CCAGCGCCTC GTTCGGACTG ATGCCGAAGA ACGCGGCGTC GAACCCGGCG CCGGCCAGGA
42901 ATCCGCCGTG GCGTGTCGTG GAGCGGCCGG CCGCGTCCGG GTCCGGGTCG TACAGCGCGT
42961 CGACGTCCCA GCCCCGGTCG GTGGGGAACT CGGTGATCGC CTCGGTACCG GCGGCGACGA
43021 GCCGCCACAG GTCCTCCGGC GAGGCGACCC CGCCGGGCAG TCGGCACGCC ATGCCGACGA
43081 TCGCGACGGG GTCGCCGGAG CCGAGGGTCT GGGCGGTCGC GGGTGCCGCT GTCGCGGAGC
43141 CGGCGAGGTG GGCGGCGAAC GCACGCGGAG TGGGGTGGTC GAACGCGGTT GACGCGGGCA
43201 CCCGCAGACC CGTCCGCGCG GCGACGGTGT TGGTGAACTC GACGGTGGTG AGCGAGTCGA
43261 GGCCGTTCTC GCGGAACGTG CGGTCCGGGG AGCAGTGTCC GGCGCCCGGC AGGCCCAGGA
43321 CGGTGGCGAC GCTGTCGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG
43381 CGGCGAGGCG GTTCGCCCAC TCCTGTTCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG
43441 CGGTGAGGAT CGGCGGCGTG GCGCCCGCCA TCGTCGCGGC CCGCGCCCCG GCGGAACCGG
43501 TCCGGGCCAC GATGTACGAG CCGCCGCCCG CGATGGCCTT CTCGATCAGG TCGCCGGTOA
43561 GCGCCGGCCG TTCGATGCCG GGCAGCGCGC GGACGGTGAC GGTGGGGAGT CCCTCCuCGG
43621 CCCGTGGCCG GGTGTGGGCG TCGGCGCCGG CCGGGCCGTC GAGCAGGACG TGCACGAGCG
43681 CGCCGGGGTT CGCGGCTTCC TCGGCTGCGG TGGTCACGTG GGTGAGGCCG GTCTCGTCGC
43741 GGAGCAGGCC GGCGACGGTG TCGGCGTCCT CCCCGGTGAC CAGGACCGGC GCGTCCGGGC
43801 CGATCGGAGG CGGCACGGTG AGGACCATCT TGCCGGTGTG CCGGGCGTGG CTCATCCACG
43861 CGAACGCGTC CCGCGCACGG CGGATGTCCC ACGGCTGCAC CGGCAGCGGG CACAGCTCAC
43921 CGCGGTCGAA CAGGTCGAGG AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCACGT
43981 CGGCCAGGTC GAACGGCTGC TGGGCGGCGT GGCGGATGTC GGTCTTGCCC ATCTCGACGA
44041 ACCGGCCGCC CGGTGCGAGC AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT
44101 TGAGCACGAC GTCGACCGGC GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA
44161 CATGGTGGGT GTCGAAGCCG TCGGCGTGCA GCAGGTGTTG TTTGGCGGGA CTGGCGGTGG
44221 CGTACACCTC GGCGCCGAGG TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACCGCCCG
44281 TCGCGGCGTG GACCAGGACC TTCTGGCCGG GTCGCAGCTC GCCCGCGTCG ACGAGGCCGT
44341 ACCAGGCGGT GGCGAACACG ATGGGCACGG ACGCGGCGAT GGGGAACGAC CATCCCCGTG
44401 GGATCCGTGC GACCAGCCGC CGGTCCGCGA CCACGCTGCG CCGGAACGCG TCCTGCACGA
44461 GACCGAACAC GCGGTCGCCG GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA
44521 TGCCCGCGGC CTCCCCGCCC ATCTCGCCCT CGCCCGGGTA GGTGCCGAGC GCGATCAGCA
44581 CGTCGCGGAA GTTCAGCCCC GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG
44641 GCGCGGCGGG ACGTCGAGCG GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCCG
44701 GCGCAGCGCC CACTGGCGCG GTCGGCAGGG GGGTGGTGTC CGCGCGTACC AGCCGGoGCA
44761 CGTAGGCCAC GCCGGCCCGC AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA
44821 CGACGTCGTC ATCGCCGTCC GTGTCCACCA GCACGAACGA TCCGGGTTCG GCGGCCTGGC
44881 GGCGCAGCGC CTCGTCCCAG AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA
44941 CGCCCACCGC GCGGCGGGTG ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC
45001 GCCGCTCCCA GACCAGTTCG CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGATGGG
45061 CCGGCAGCCC CGCGAGCCGC GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGATCGTGG
45121 TGACGTGCCA GATCTCGTCG GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA
45181 GGATCGCCTC GGCGGGGACG CGGGGGCCGT CGGAAACGAC GTAGAGCACG GGTATGTCGC
45241 CGAGGACGGG GTGCGGGCGG CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACCTCCTGGG
45301 CGACGGTCTC GATCTCCCGG GGGTGGATGT TCTCCCCGCC GCGGATGATC AGCTCCTTGA
45361 CCCGGCCGGT GATCGTCACG TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGGTGCGGT
45421 ACCAGCCGTC CACGAGCACC TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCATGA
45481 GGCTCGGCCC GCTCGCCCAC AGCTCGCCCT CCTCGCCGGG TGCCACGTCG GCGCCGGAGA
45541 CCGGGTCGAC GAACCGCAGC GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC
45601 GCGCATCCTC CAGGGTGTTG GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCGTACGTGT
45661 CGAGCAGGGG CACGCCGAAC GTCGCCTCGA AATCCCTGGT GAGCGACGCC GGCGAGGTGG
45721 ATCCGGCGAC CAGCGCCACG CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA
45781 GGAGGTAGCG GTACATCGTC GGCACGCCGA CGAGCACGGT GCTGGAGTGT TCGGCCAGGG
45841 CGTCGAGGAC GTCACGCGCG ACGAAGCCGC CCAGGATACG GGCGGACGCG CCGACCGTGA
45901 GGACGGCGAG CAGGCAGAGG TGGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA
45961 GTTCGTCGTC CTCGGTCAGC CGCCAGGACG GCACGTCGCA GTGCATCGCG GACCACAGGC
46021 CGCTGCGCTG TGCGGAAACC ACGCCCTTGG GACGGCCGGT GGTGCCGGAG GTGTAGAGCA
46081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCGT CGCGGGGCGG GCACGGCGGC TCGGTCCCGG
46141 CGAGGTCCTC GTAGGAGACG CAGTCCGGTG CCCGGCGCCC GACGAGCACG ACGGTGGCGT
46201 CGGTGCCGGT GCGGCGCACC TGGTCGAGGT GGGTTTCGTC GGTGACCAGC ACGGTCGCGC
46261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC GTCCGGGTTG AGCGGGACCG
46321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GGTCTCGATC CGGTTGCCGA
46381 GCAGCATCGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGGTGT CCGGCGAGCC
46441 CGCCGGCCCG GAGCCGGAGT TGCGTGTACG TCACGGCGCG TTGGGAATCC GTGTAGGCGA
46501 TCCGGTCGCC GCGTCGCTCG GCATGGATGC GGAGCAATTC GTGCAACGGC CGGATTGGTT
46561 CCACACGCGC CATGGAAACA CCTTTCTCTC GACCAACCGC ACAACAGCAC GGAACCGGCC
46621 ACGAGTAGAC GCCGGCGACG CTAGCAGCGT TTTCCGGACC GCCACCCCCT GAAGATCCCC
46681 CTACCGTGGC CGGCCTCCCC GGACGCTCAT CTAGGGGGTT GCACGCATAC CGCCGTGCGT
46741 AATTGCCTTC CTGATGACCG ATGCCGGACG CCAGGGAAGG GTGGAGGCGT TGTCCATATC
46801 TGTCACGGCG CCGTATTGCC GCTTCGAGAA GACCGGATCA CCGGACCTCG AGGGTGACGA
46861 GACGGTGCTC GGCCTGATCG AGCACGGCAC CGGCCACACC GACGTGTCGC TGGTGGACGG
46921 TGCTCCCCGG ACCGCCGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTCTG
46981 GCACGCACAG CGCCCTGTCG AGTCCGGCAT GGACAACGGC ATCGCCTGGG CCCGCACCGA
47041 CGCGTACCTG TTCGGTGTCG TGCGCACCGG CGAGAGCGGC AGGTACGCCG ATGCCACCGC
47101 GGCCCTCTAC ACGAACGTCT TCCAGCTCAC CCGGTCGCTG GGGTATCCCC TGCTCGCCCG
47161 GACCTGGAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTACCG
47221 GGACTTCTGC GTGGGCCGCG CCCAGGCGCT CGACGAGGGC GGGATCGACC CGGCCACCAT
47281 GCCCGCGGCC ACCGGTATCG GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCGC
47341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC
47401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGCACGGGCC ACCTGGCTGG GCCCGCCGCA
47461 GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA
47521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC
47581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT
47641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGCCGCACG
47701 CCTGTCGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT
47761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGCTGCGC
47821 CTCGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT
47881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC
47941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT
48001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC
48061 GGCGAGCCCC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG
48121 GGCAGCGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC
48181 GCCACCGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCTC
48241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA
48301 CTCGCAGCCC ACTACACGGC GCTGCGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG
48361 CCGGTGCAGT ACGCCGACTT CGCCGCCTGG GAGCGGCGCG AACTCACCGG CGCCGGACTG
48421 GACAGGCGTC TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCTC
48481 CCCACCGACC CTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG
48541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG
48601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC
48661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC
48721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA
48781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC
48841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG
48901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCC
48961 GAACCGTTCC GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG
49021 GAGCCGGGTG GCGCGCTGAC CGGCGAACTG CTCTACAGCC GTGCGCTGTT CGACGAGCCA
49081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCCG
49141 GACGTACGGC TGTCGCGGCT GCCGGCCGGC GACGCGACGG CGGCAGCGCC CGTGGTGCCC
49201 TCGAACGACA CGGCGCGGGA CCTGCCCGTC GACACGCTGC CGGGCCTGCT GGCCCGGTAC
49261 GCCGCACGCA CCCCCGGCGC CGTGGCCGTC ACCGACCCGC ACATCTCCCT CACCTACGCG
49321 CAGCTGGACC GGCGGGCGAA CCGCCTCGCG CACCTGCTCC GCGCGCGCGG CACCGCCACC
49381 GGCGACCTGG TCGGGATCTG CGCCGATCGC GGCGCCGACC TGATCGTCGG CATCGTGGGG
49441 ATCCTCAAGG CGGGCGCCGC TTATGTGCCG CTGGACCCCG AACATCCTCC GGAGCGCACG
49501 GCGTTCGTGC TGGCCGACGC GCAGCTGACC ACGGTGGTGG CGCACGAGGT CTACCGTTCC
49561 CGGTTCCCCG ATGTGCCGCA CGTGGTGGCG TTGGACGACC CGGAGCTGGA CCGGCAGCCG
49621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC TCGCCTACGC GATCTACACG
49681 TCCGGGTCGA CCGGCAGGCC GAAGGCCGTG CTCATGCCGG GTGTCAGCGC CGTCAACCTG
49741 CTGCTCTGGC AGGAGCGCAC GATGGGCCGC GAGCCGGCCA GCCGCACCGT CCAGTTCGTG
49801 ACGCCCACGT TCGACTACTC GGTGCAGGAG ATCTTTTCCG CGCTGCTGGG CGGCACGCTC
49861 GTCATCCCGC CGGACGAGGT GCGGTTCGAC CCGCCGGGAC TCGCCCGGTG GATGGACGAA
49921 CAGGCGATTA CCCGGATCTA CGCGCCGACG GCCGTACTGC GCGCGCTGAT CGAGCACGTC
49981 GATCCGCACA GCGACCAGCT CGCCGCCCTG CGGCACCTGT GCCAGGGCGG CGAGGCGCTG
50041 ATCCTCGACG CGCGGTTGCG CGAGCTGTGC CGGCACCGGC CCCACCTGCG CGTGCACAAT
50101 CACTACGGTC CGGCCGAAAG CCAGCTCATC ACCGGGTACA CGCTGCCCGC CGACCCCGAC
50161 GCGTGGCCCG CCACCGCACC GATCGGCCCG CCGATCGACA ACACCCGCAT CCATCTGCTC
50221 GACGAGGCGA TGCGGCCGGT TCCGGACGGT ATGCCGGGGC AGCTCTGCGT CGCCGGCGTC
50281 GGCCTCGCCC GTGGGTACCT GGCCCGTCCC GAGCTGACCG CCGAGCGCTG GGTGCCGGGA
50341 GATCCGGTCG GCGAGGAGCG CATGTACCTC ACCGGCGACC TGGCCCGCCG CGCGCCCGAC
50401 GGCGACCTGG AATTCCTCGG CCGGATCGAC GACCAGGTCA AGATCCGCGG CATCCGCGTC
50461 GAACCGGGTG AGATCGAGAG CCTGCTCGCC GAGGACGCCC GCGTCACGCA GGCGGCGGTG
50521 TCCGTGCGCG AGGACCGGCG GGGCGAGAAG TTCCTGGCCG CGTACGTCGT ACCGGTGGCC
50581 GGCCGGCACG GCGACGACTT CGCCGCGTCG CTGCGCGCGG GACTGGCCGC CCGGCTGCCC
50641 GCCGCGCTCG TGCCCTCCGC CGTCGTCCTG GTGGAGCGAC TGCCGAGGAC CACGAGCGGC
50701 AAGGTGGACC GGCGCGCGCT GCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGGGGGTT
50761 ACGCCCCGCA CCGATGCCGA GCGGACGGTG TGCCGGATCT TCCAGGAGGT GCTCGACGTC
50821 CCGCGGGTCG GTGCCGACGA CGACTTCTTC ACGCTCGGCG GGCACTCCCT GCTCGCCACC
50881 CGGGTCGTCT CCCGCATCCG CGCCGAGCTG GGTGCCGATG TCCCGCTGCG TACGCTCTTC
50941 GACGGGCGGA CGCCCGCCGC GCTCGCCCGT GCGGCGGACG AGGCCGGCCC GGCCGCCCTG
51001 CCCCCGATCG CGCCCTCCGC GGAGAACGGG CCGGCCCCCC TCACCGCGGC ACAGGAACAG
51061 ATGCTGCACT CGCACGGCTC GCTGCTCGCC GCGCCCTCCT ACACGGTCGC CCCGTACGGG
51121 TTCCGGCTGC GCGGGCCACT CGACCGCGAA GCGCTCGACG CGGCACTGAC CCGGATCGCC
51181 GCGCGCCACG AGCCGCTGCG GACCGGGTTC CGCGATCGGG AACAGGTCGT GCGGCGGCCC
51241 GCTCCGGTGC GCGCCGAGGT GGTTCCGGTG CCGGTCGGCG ACGTCGACGC CGCGGTCCGG
51301 GTCGCCCACC GGGAGCTGAC CCGGCCGTTC GACCTCGTGA ACGGGTCGTT GCTGCGTGCC
51361 GTGCTGCTGC CGCTGGGCGC CGAGGATCAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC
51421 GGTGACGGAT GGTCCTTCGA CCTCCTGGTC CGGGAGTTGT CGGGGACGCA ACCGGACCTT
51481 CCGGTGTCCT ACACGGACGT GGCCCGGTGG GAACGGAGTC CGGCCGTGAT CGCGGCCAGG
51541 GAGAACGACC GGGCCTACTG GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC
51601 GCGGTCCGGC CCGGCGGGGC ACCGACCGGG CGGGCGTTCC TGTGGACGCT CAAGGACACC
51661 GCCGTCCTGG CGGCACGCCG GGTCGCGGAC GCCCACGACG CGACGTTGCA CGAAACCGTG
51721 CTCGGCGCCT TCGCCCTGGT CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG
51781 ACGCCGTTCG CGGACCGGGG GTACGCCGGG ACCGACCACC TCATCGGCTT CTTCGCGAAG
51841 GTCCTCGCGC TGCGCCTCGA CCTCGGCGGC ACGCCGTCGT TCCCCGAGGT GCTGCGCCGG
51901 CTGCACACCG CGATGGTGGG CGCGCACGCC CACCAGGCGG TGCCCTACTC CGCGCTGCGC
51961 GCCGAGGACC CCGCGCTGCC GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC
52021 GCGGAACTGC GGCTGCCCGG CATGCACACC GAGCCGTTCC CCGTCGTCGC CGAGACCGTC
52081 GACGAGATGA CCGGCGAACT GTCGATCAAC CTCTTCGACG ACGGTCGCAC CGTCTCCGGC
52141 GCGGTGGTCC ACGATGCCGC GCTGCTCGAC CGTGCCACCG TCGACGATTT GCTCACCCGG
52201 GTGGAGGCGA CGCTGCGTGC CGCCGCGGGC GACCTCACCG TACGCGTCAC CGGTTACGTG
52261 GAAAGCGAGT AGCCATGCCC GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG
52321 CGGAACTCCA GAAGACCCGT GCGGAACTCG CCGCGCACAG CGAGCCGTTG GCGATCGTGG
52381 GGATGGCCTG CCGGCTGCCC GGCGGGGTCG CGTCGCCGGA GGACCTGTGG CAGTTGCTGG
52441 AGTCCGGTGG CGACGGCATC ACCGCGTTCC CCACGGACCG GGGCTGGGAG ACCACCGCCG
52501 ACGGTCGCGG CGGCTTCCTC ACCGGGGCGG CCGGCTTCGA CGCGGCGTTC TTCGGCATCA
52561 GCCCGCGCGA GGCGCTGGCG ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACCTCGTGGG
52621 AGGCGTTCGA GCACGCGGGC ATCGATCCGC AGACGCTGCG GGGCAGTGAC ACGGGGGTGT
52681 TCCTCGGCGC GTTCTTCCAG GGGTACGGCA TCGGCGCCGA CTTCGACGGT TACGGCACCA
52741 CGAGCATTCA CACGAGCGTG CTCTCCGGCC GCCTCGCGTA GTTCTACGGT CTGGAGGGTC
52801 CGGCGGTCAC GGTCGACACG GCGTGTTCGT CGTCGCTGGT GGCGCTGCAC CAGGCCGGGC
52861 AGTCGCTGCG CTCCGGCGAA TGCTCGCTCG CCCTGGTCGG CGGCGTCACG GTGATGGCCT
52921 CGCCGGCGGG GTTCGCGGAC TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGC r GCA
52981 AGGCCTTCGC GGAAGCGGCT GACGGCACCG GTTTCGCCGA GGGGTCCGGC GTCCTGATCG
53041 TCGAGAAGCT CTCCGACGCC GAGCGCAACG GCCACCGCGT GCTGGCGGTC GTCCGGGGTT
53101 CCGCCGTCAA CCAGGACGGT GCCTCCAACG GGCTGTCCGC GCCGAACGGG CCGTCGCAGG
53161 AGCGGGTGAT CCGGCAGGCC CTGGCCAACG CCGGACTCAC CCCGGCGGAC GTGGACGCCG
53221 TCGAGGCCCA CGGCACCGGC ACCAGGCTGG GCGACCCCAT CGAGGCACAG GCCGTGCTGG
53281 CCACCTACGG GCAGGGGCGC GACACCCCTG TGCTGCTGGG CTCGCTGAAG TCCAACATCG
53341 GCCACACCCA GGCCGCCGCG GGCGTCGCCG GTGTCATCAA GATGGTCCTC GCCATGCGGC
53401 ACGGCACCCT GCCCCGCACC CTGCACGTGG ACACGCCGTC CTCGCACGTC GACTGGACGG
53461 CCGGCGCCGT CGAACTCCTC ACCGACGCCC GGCCCTGGCC CGAAACCGAC CGCCCACGGC
53521 GCGCCGGTGT CTCCTCCTTC GGCGTCAGCG GCACCAACGC CCACATCATC CTCGAAAGCC
53581 ACCCCCGACC GGCCCCCGAA CCCGCCCCGG GACCCGACAC CGGACCGCTG CCGCTGCTGC
53641 TCTCGGCCCG CACCCCGCAG GCACTCGACG CACAGGTACA CCGCCTGCGC GCGTTCCTCG
53701 ACGACAACCC CGGCGCGGAC CGGGTCGCCG TCGCGCAGAC ACTCGCCCGG CGCACCCAGT
53761 TCGAGCACCG CGCCGTGCTG CTCGGCGACA CGCTCATCAC CGTGAGCCCG AACGCCGGCC
53821 GCGGACCGGT GGTCTTCGTC TACTCGGGGC AAAGCACGCT GCACCCGCAC ACCGGGCGGC
53881 AACTCGCGTC CACCTACCCC GTGTTCGCCG AAGCGTGGCG CGAGGCCCTC GACCACCTCG
53941 ACCCCACCCA GGGCCCGGCC ACGCACTTCG CCCACCAGAC CGCGCTCACC GCGCTCCTGC
54001 GGTCCTGGGG CATCACCCCG CACGCGGTCA TCGGCCACTC CCTCGGTGAG ATCACCGCCG
54061 CGCACGCCGC CGGTGTCCTG TCCCTGAGGG ACGCGGGCGC GCTCCTCACC ACCCGCACCC
54121 GCCTGATGGA CCAACTGCCG TCGGGCGGCG CGATGGTCAC CGTCCTGACC AGCGAGGAAA
54181 AGGCACGCCA GGTGCTGCGG CCGGGCGTGG AGATCGCCGC CGTCAACGGC CCCCACTCCC
54241 TCGTGCTGTC CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATCCACC
54301 ACCGCCTGCC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC GTCGCCCCCC
54361 TCCTCGACGT CGCCCGGACC CTGACGTACC ACCAGCCCCA CACCGCCATC CCCGGCGACC
54421 CCACCACCGC CGAATACTGG GCGCACCAGG TCCGCGACCA AGTACGTTTC CAGGCGCACA
54481 CCGAGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC CAACCAGGAC CTCTCGCCGC
54541 TCGTCGACGG CGTTGCCGCC CAGACCGGTA CGCCCGACGA GGTGCGGGCG CTGCACACCG
54601 CGCTCGCGCA GCTCCACGTC CGCGGCGTCG CGATCGACTG GACGCTCGTC CrCGGCGGGG
54661 ACCGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC
54721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAC CCGCTGCTCG
54781 GCGCCGCGGT CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC CTGTCGCTGG
54841 CCTCCCATCC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG CCCGGCGCGG
54901 CCTTCCTCGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG CTGCACGAAC
54961 TCGTCATCGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TGTGGCGGTC TCCGTCGAGA
55021 TCGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT
55081 CGGGCCTGTG GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGCCACGG
55141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACGTCT
55201 ACGACCGGTT CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGHCCG
55261 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG
55321 ACGCCGCCCG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC
55381 7GGCCGCGCT CGACGCACCC GGCGGGGCGC CCCGACTGCC GTTCTCGTTC CAGGACGTCC
55441 GCATCCACGC GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA
55501 GCACCGTCCG CATGACCGGC CCGGACGGGC AGCTGGTGGC CGTGGTCGGT GCCGTGCTGT
55561 CGCGCCCGTA CGCGGAAGGC TCCGGTGACG GCCTGCTGCG CCCGGTCTGG ACCGAGCTGC
55621 CGATGCCCGT CCCGTCCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG
55681 ACGGCGACGT TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC
55741 GCCACCTGTC CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG
55801 CTGCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC
55861 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG
55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGGTC CGGATGTCCG
55981 ACCCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT
56041 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG
56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG
56161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG
56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG
56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC CCCGACGGCT
56341 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCCTGG
56401 TCGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGGTG
56461 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCCA
56521 GTACCGGCAA GCAGCACGTC CTGCGCGCCG CCGGGCTGCC CGACACGCAC ATCGCCGACT
56581 CTCGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGCTGA
56641 CCGGCGAGTT CATCGACGCG TCGCTCGACC TGCTGGACGC CGACGGCCGG TTCGTCGAGA
56701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC
TGCTGGACGC

ACGCGGGCGC 

CGCTCGGCTG 

CGCTCGACCC 

TCGCCCGCCA 

GGACGCCCGG 

TGGAGCGGGT 

GCACCGTCGC 

GCGCCTGGTA 

CGTCGGCCGC 

TCCTCGACGC 

GGGGGCTCTG 

GGATGCGGCG 

CGGCCGGCCG 

TGCCGCTGCT 

CGTCCGCCGA 

TCGTCCGGGA 

CGGCGGCGTT 

TCACCGAGGC 

ACGTGCTCGC 

GGACCGCGGC 

GGCTGCCCGG 

ACGCCATCAC 

ACCCCGACGC 

GCTTCGACGC 

AGCGGGTGCT 

CGACCCGCGG 

GTGCGGACAC 

TGTCGTACTT 

CGCTGGTGGC 

TGGTCGGCGG 

GCGGCCTCGC 

TCGCCGAGGG 

ACACCGTCCT 

TGTCGGCGCC 

GGCTCACCCC 

ACCCCATCGA 

TGCTGGGCTC 

TCATCAAGAT 

AGCCGTCGCC 

CGTGGCCCGA 

CCAACGCCCA 

CCGGTGACCT 

GCCGACTGCG 

CGCTGGCCCG 

CCACACCCCC 

AGCATCCCGC 

ATGAAGCGCT 

TGCTCTTCGC 

ACGCGGTCAT 

CGCTGGACGA 

CACCCGGTGC 

CGGGCGTGGA 

ACGCCGTGCT 

CCGGGCACTC 

TCCGCTACCA 

CCGAGCAGGT 

TGTTCGTGGA 

AGAACGGCAC 

GCGGTGCCAC 

TGCCCGCGTA 

CCGACGCGGG 

TGTTCACGGG 

TGGCCGCCGC 



GGGCGGCGAC 
GCTGGAGCCG 
GATGAGCCGC 
GGAGGGCGCC 
CCTGCGCGAA 
CGTCCACCTG 
GGACCGGCCG 
GTCGCTCACC 
CCTGCACGAG 
CGGCGTGCTC 
GCTCGCCGAG 
GGAGGACGTG 
CAGCGGTTTC 
CACCGGAAGT 
GCGCGGCCTG 
CCGGCTCGCC 
GAGCACCGCC 
CAAGGACCTC 
GACCGGTGTG 
CGGGAAGCTC 
CACGGCCGGT 
CGGGGTCGCG 
GGAGTTCCCG 
GATCGGCAAG 
GGCGTTCTTC 
CCTGGAGACG 
CAGCGACACC 
CGACGGCTTC 
CTACGGTCTG 
GCTGCACCAG 
CGTCACGGTG 
GCCGGACGGC 
TGCCGGTGTG 
GGCGGTCGTC 
GAACGGGCCG 
GGCGGACGTG 
GGCACAGGCG 
GCTGAAGTCC 
GGTGCAGGCC 
GCACGTCGAC 
GACCGACCGG 
CGTCATCCTG 
TCCCCTGCTG 
CGCCTACCTG 
GCGCACACAC 
CGCGGACCGG 
GATGGGCGAG 
CCGCCGCCTT 
CCACCAGGCG 
CGGCCACTCG 
CGCGTGCACC 
CATGGTCACC 
GATCGCCGCC 
CACCGTCGCC 
CGCGCACATG 
CCCTCCCCAC 
CCGCAAGCCC 
GATCGGCCCC 
CGCGGACGAG 
GCTCGACTGG 
CGCGTTCCAA 
CCACCCCGTG 
TTCCGTGCCG 
GGACGCGGTC 



CGCATCGGCG 

CTGCCGGTCC 

GCCCGCCACA 

GTCGTCCTCA 

CGGCATGTCT 

CCCTGCGACG 

ATCACCGCCG 

CCCGAGCGTT 

CTGACGAAGG 

GGCAACGCCG 

CTGCGCCACG 

AGCGGGCTCA 

CGGGCCATCA 

CCCGTGGTGG 

CGGCGGACGA 

GCGCTGACCG 

GCCGTGCTCG 

GGCATCGACT 

CGGCTGAACG 

GGCGACGAAC 

GCGCACGACG 

TCACCCGAGG 

ACGGACCGCG 

ACCTTCGTCC 

GGCATCAGCC 

TCGTGGGAGG 

GGCGTGTTCG 

GGCGCGACCG 

GAGGGTCCGG 

GCCGGGCAGT 

ATGGCGTCTC 

CGGGCGAAGG 

CTGATCGTCG 

CGTGGTTCGG 

TCGCAGGAGC 

GACGCCGTCG 

GTACTGGCCA 

AACATCGGCC 

CTCCGGCACG 

TGGACGGCCG 

CCACGGCGTG 

GAGGCCGGAC 

GTGTCGGCAC 

GACACCACCC 

TTCGCCCACC 

CCCGACGAAC 

CAGCTCGCCG 

GACAACCCCG 

GCGTTCACCG 

CTGGGCGAGA 

CTGATCACCA 

GTACTGACCA 

GTCAACGGGC 

GGGCAGCTCG 

GAGCCCGTGG 

ACCTCCATTC 

GTGCTGTTCC 

GCCCAGGhCC 

GTGCACGCGC 

CCCCGCATCC 

CGGCGGCACT 

CTGGGCTCCG 

ACCGGTGCGG 

GACTGCGCCA 



AGATCCTGGG 

GTGCCTGGGA 

TCGGCAAGAA 

CCGGCGGCTC 

ACCTGCTGTC 

TCGGTGACCG 

TGGTGCACCT 

TCGACACGGT 

AGCAGGACCT 

GCCAGGGCAA 

GTTCCGGGCT 

CCGCGGCGCT 

CCGCGCAACA 

TCGCGGCGGC 

CCGTCCGGCG 

GCGACGAGCT 

GCCACGTGGG 

CGCTCACCGC 

CCACGGCGGT 

TGACCGGCAC 

AGCCGCTGGC 

AGCTGTGGCA 

GCTGGGACGT 

GGCACGGTGG 

CGCGCGAGGC 

CGTTCGAAAG 

TCGGCGCCTT 

GCTCGCAGAC 

CGGTCACGGT 

CGCTGCGCTC 

CCGGCGGCTT 

CGTTCGGCGC 

AGAGGCTCTC 

CGGTCAACCA 

GGGTGATCCG 

AGGCCCACGG 

CCTACGGACA 

ACGCCCAGGC 

GGGAGCTGCC 

GCGCCGTCGA 

CCGCCGTCTC 

CGGTAACGGA 

GCTCACCGGA 

CGGACGTCGA 

GCGCCGTGCT 

TCGTCTTCGT 

CCGCCCATCC 

ACCCCCACGA 

CCCTCCTGCG 

TCACCGCGGC 

CGCGCGCCCG 

GCGAAGAGAA 

CCCACTCCAT 

GCATCCACCA 

CCGCCGAGCT 

CGAACGACCC 

ACGCCCACGC 

TCTCCCCGCT 

TGCACACCGC 

TCGGGGCTGG 

ACTGGATCGA 

GTATCGCCCT 

ACCGCGCGGT 

CGGTCGAGCG 



CGAACTGCTC 
CGTCCGGCAG 
CGTCCTGACG 
CGGCACGCTC 
CCGGACGGCA 
GG AC CAGCTG 
CGCCGGTGCG 
GCTGCGCCCG 
CGCCGCGTTC 
CTACGTCGCC 
GCCGGCCCTC 
CGGCGAAGCC 
GGGCATGCAC 
GCTCGACGAC 
GGCCGCCGTC 
CGCCGAAGCG 
TGGCGAGGAC 
GGTCCAGCTG 
CTTCGACTTC 
CCGCGCGCCC 
GATCGTGGGA 
CCTCGTGGCA 
CGACGCGATC 
CTTCCTCACC 
CCTCGCGATG 
CGCCGGCATC 
CTCCTACGGT 
CAGTGTGCTC 
CGACACGGCG 
CGGCGAATGC 
CGTGGAGTTC 
GGGTGCGGAC 
CGACGCCGAA 
GGATGGTGCC 
GCAGGCCCTG 
CACCGGCACC 
GGAGCGCGCC 
CGCGTCCGGC 
GCCGACGCTG 
ACTGCTGACG 
CTCGTTCGGG 
GACGCCCGCG 
AGCGCTCGAC 
CCGGGTGGCC 
GCTCGGTGAC 
CTACTCCGGC 
CGTGTTCGCC 
CCCCACGCAC 
GTCCTGGGGC 
GCACGCCGCC 
CCTCATGCAC 
GGChCGCGAG 
CGTGCTGTCC 
CCGCCTGCCC 
GCTCGCCACC 
CACCACCGCT 
GCAGCAGTAC 
CGTCGACGGG 
GCTCGCGCAC 
GTCACGGCAC 
GTCGGCACGC 
CGCCGGGTCG 
GTTCGTCGCC 
GCTCGACATC 



CGGCTGTTCG 
GCACGCf!ACG 
CTGCCCCGGC 
GCCGGCATCC 
CCGCCCGAGG 
GCGGCGGCCC 
CTGGACGACG 
AAGGCCGACG 
GTGCTCTACT 
GCGAACGCGT 
TCCATCGCCT 
GACCGGGACC 
CTGTACGAGG 
GCGCCGGACG 
CGGGAGTGTT 
CTGCTGACGC 
ATCCCCGCGA 
CGCAACGCCC 
CCGACCCCGC 
GTCGTGCCCC 
ATGGCC^GCC 
TCCGGCACCG 
TACGACCCGG 
GGCGCGACAG 
GACCCGCAGC 
ACCCCGGACT 
TACGGCACCG 
TCCGGCCGGC 
TGTTCGTCGT 
TCGCTCGCCC 
TCCCGGCAGC 
GGCACGAGCT 
CGCAACGGTC 
TCCAACGGGC 
GCCAACGCCG 
AGGCTGGGCG 
ACCCCCCTGC 
GTCGCCGGCA 
CACGCCGAGG 
TCGGCCPGGC 
GTGAGCGGCA 
GCATCGCCTT 
GAGCAGATCC 
GTGGCACAGA 
ACCGTCATCA 
CAGGGCACCC 
GACGCCTGGC 
AGCCAGCATG 
ATCACCCCGC 
GGCATCCTGT 
ACGCTCCCGC 
GCGTTGCGGC 
GGGGACGAGG 
GCCCCGCACG 
ACCCGCGGGC 
GAGTACTGGG 
CCGGACGCCG 
ATCCCGCTGC 
CTCTACGCGC 
GACGCGGATG 
CCGGCCGCAT 
CCGGGCCGGG 
GAGCTGGCGC 
GCCTCCGTGC 
60 601 CCGGCCGGCC GGGCCATGGC
60661 ACGGCCGGCG CCGGTTCACC 
607 21 CCGAGGGGGT GCTGCGCCCC 
60781 CCCCACCGGG CGCGGTGCCC 
5 608 41 TCTTCGCCGA GGCCGAGGTG 
60901 ACGCGGTCTT CTCCGCGGTC 
60 961 CGGTGCACGC GTCGGACGCC 
61021 CCATGGGATT CGCCGCCTTC 
61081 CGCTGCGGGA GGTGGCGTCA 

10 6114 1 AGTGGCTCGC GGTCGCCGAG 
61201 TCACCGCCGC CCACCCCGAC 
61261 CCCGCGTCCT GACCGCCCTG 
61321 ACACCACCAC CGACCCCGCC 
61381 AACACCCCCA CCGCATCCGC 

15 61441 CCCAACTCGC CACCCTCGAC 
61501 CCCACCTCAC CCCCCTCCAC 
61561 ACGCCATCAT CATCACCGGC 
61621 ACCACCCCCA CACCTACCTC 
61681 ACCTCCCCTG CGACGTCGGC 

20 6174 1 AACCCCTCAC CGCCATCTTC 
61801 TCACCCCCGA CCGCCTCACC 
61861 ACCACCTCAC CCAAAACCAA 
61921 TCCTCGGCAG CCCCGGACAA 
61981 CCACCCACCG CCACACCCTC 

25 6204 1 CCACCAGCAC CCTCACCGGA 
62101 GTTTCCTCCC GATCACGGAC 
62161 GCGAGGACTT CGTCATGGCC 
62221 CGCCCATCCT GAGCGGCCTG 
6228 \ TCGCCCAGCG GCTCGCCGAG 

30 6234 1 TCTCGGACGC CACGGCCGCC 
624 01 CGACGTTCAA GGACCTCGGC 
624 61 CGGAGGCGAC CGGGCTGCGG 
62521 TCCTCGCCGC CAAGCTCCGC 
6258 1 CGGCACGGAC CCACCACGAC 

35 6264 1 GCGGGGTCGC CTCGCCGGAG 
62701 CCGAGTTCCC CACCGACCGC 
627 61 CCCCCGGCAA GACCTACGTC 
62821 CCGCGTTCTT CGGCATCAGC 
62881 TCCTCGAAAC CTCCTGGGAG 

40 62 94 1 GCAGCGACAC CGGCGTGTTC 
63001 TGGGCGGGTT CGGCGCCACC 
63061 TCTTCGGCAT GGAGGGCCCG 
63121 CCCTGCACCA GGCGGCACAG 
63181 GTGTCACGGT GATGCCCACC 

45 632 41 CCCCCGACGG CCGTTGCCAG 
63301 GCGCCGGCGT TCTTGTGCTG 
63361 TCGCGGTCGT CCGCTCCTCC 
6342 1 CCAACGGCCC CTCCCAGCAG 
6 3481 CCGCCGACGT GGACGTGGTG 
■50 6354 1 AGGCACAGGC CATCATCGCG 
63601 CGGTCAAGTC GAACATCGGA 
63661 TGGTCATGGC GATGCGCCAC 
63721 CGCATGTGGA CTGGACCGAG 
63781 ACGCGGGACG CCCGCGCCGC 

55 638 41 ACGTGATCCT TGAGGGTGTT 
63901 TGCCGTTGCC GGTGTCGGCT 
63961 AGGGGTATCT GCGCGGGAGT 
64 021 GTGCTGTCTT CGGTCACCGT 
64 081 TGGATCAGCC GCGTACGGTG 

60 64141 GTGTGGAGTT GATGGACCGT 
64 201 CGTTGTTGCC GCACACGGGC 
64 2 61 AGCGGGTGGA GGTGGTCCAG 
64 321 GGCAGGCCCA CGGGGTCGTA 
64 381 CGGCGTGCGT GGCCGGGGCC 
CGGACGACCG TACAGACCTG GGTCGACGAG CCGGCGGACG
GTGCACACCC GCACCGGCGA CGCCCCGTGG ACGCTGCACG 
CATGGCACGG CCCTGCCCGA TGCGGCCGAC GCCGAG JGGC 
GCGGACGGGC TGCCGGGTGT GTGGCGCCGG GGGGACCAGG 
GACGGACCGG ACGGTTTCGT GGTGCACCCC GACCTGCTCG 
GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACCTGA 
ACCGTACTGC GCGCCTGCCT t CACCCGGCGC ACCGACGGAG 
GACGGCGCCG GCCTGCCGGT ACTCACCGCG GAGGCGGTGA 
CCGTCCGGCT CCGAGGAGTC GGACGGCCTG CACCGGTTGG 
GCGGTCTACG ACGGTGACCT GCCCGAGGGA CATGTCCTGA 
GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA 
CAACACCACC TCACCACCAC CGACCACACC CTCATCGTCC 
GGCGCCACCG TCACCGGCCT CACCCGCACC GCCCAGAACG 
CTCATCGAAA CCGACCACCC CCACACCCCC CTCCCCCTGG 
CACCCGCACC TCCGCCTCAC CCACCACACC CTCCACCACC 
ACCACCACCC CACCCACCAC CACCCCCCTC AACCCCGAAC 
GGCTCCGGCA CCCTCGCCGG CATCCTCGCC CGCCACCTGA 
CTCTCCCGCA CCCCACCCCC CGACGCCACC CCCGGCACCC 
GACCCCCACC AACTCGCCAC CACCCTCACC CACATCCCCC 
CACACCGCCG CCACCCTCGA CGACGGCATC CTCCACGCCC 
ACCGTCCTCC ACCCCAAAGC CAACGCCGCC TGGCACCTGC 
CCCCTCACCC ACTTCGTCCT CTACTCCAGC GCCGCCJCCG 
GGAAACTACG CCGCCGCCAA CGCCTTCCTC GACGCCCTCG 
GGCCAACCCG CCACCTCCAT CGCCTGGGGC ATGTGGCACA 
CAACTCGACG ACGCCGACCG GGACCGCATC CGCCGCGGCG 
GACGAGGGCA TGCGCCTCTA CGAGGCGGCC GTCGGCTCCG 
GCCGCGATGG ACCCGGCACA GCCGATGACC GGCTCCGTAC 
CGCAGGAGCG CGCGGCGCGT CGCCCGTGCC GGGCAGACGT 
CTGCCCGACG CCGACCGCGG CGCGGCGCTG ACCACCCTCG 
GTGCTCGGCC ACGCCGACGC CTCCGAGATC GCGCCGACCA 
ATCGACTCGC TCACCGCGAT CGAGCTGCGC AACCGGCTCG 
CTGAGTGCCA CGCTGGTGTT CGACCACCCG ACACCTCGGG 
ACCGATCTGT TCGGCACGGC CGTGCCCACG CCCGCGCGGA 
GAGCCACTCG CGATCGTCGG CATGGCGTGC CGACTGCCCG 
GACCTGTGGC AGCTCGTGGC GTCCGGCACC GACGCGATCA 
GGCTGGGACA TCGACCGGCT GTTCGACCCG GACCCGGACG 
CGGCACGGCG GCTTCCTCGC CGAGGCCGCC GGCTTCGATG 
CCGCGCGAGG CACGGGCCAT GGACCCGCAG CAGCGCGTCA 
GCGTTCGAGA ACGCGGGCAT CGTGCCGGAC ACGCTGCGCG 
ATGGGCGCGT TCTCCCATGG GTACGGCGCC GGCGTCGACC 
GCCACGCAGA ACAGCGTGCT CTCCGGCCGG TTGTCGVACT 
GCCGTCACCG TCGACACCGC CTGCTCGTCG TCGCTGGTCG 
GCGCTGCGGA CTGGAGAATG CTCGCTGGCG CTCGCCGGCG 
CCGCTGGGCT ACGTCGAGTT CTGCCGCCAG CGGGGACTCG 
GCCTTCGCGG AAGGCGCCGA CGGCACGAGC TTCTCGGAGG 
GAGCGGCTCT CCGACGCCGA GCGCAACGGA CACACCGTCC 
GCCGTCAACC AGGACGGCGC CTCCAACGGC ATCTCCGCAC 
CGCGTCATCC GCCAGGCCCT CGACAAGGCC GGGCTCGCCC 
GAGGCCCACG GCACCGGAAC CCCGCTGGGC GACCCGATCG 
ACCTACGGCC AGGACCGCGA CACACCGCTC TACCTCGGTT 
CACACCCAGA CCACCGCCGG TGTCGCCGGC GTCATCAAGA 
GGCATCGCGC CGAAGACACT GCACGTGGAC GAGCCGTCGT 
GGTGCGGTGG AACTGCTCAC CGAGGCGAGG CCGTGGCCCG 
GCGGGCGTGT CGTCGCTCGG TATCAGCGGT ACGAACGCCC 
CCCGGGCCGT CGCGTGTGGA GCCGTCTGTT GACGGGTTGG 
CGGAGTGAGG CGAGTCTGCG GGGGCAGGTG GAGCGGCTGG 
GTGGATGTGG CCGCGGTCGC GCAGGGGTTG GTGCGTGAGC 
GCGGTACTGC TGGGTGATGC CCGGGTGATG GGTGTGGCGG 
TTCGTCTTTC CCGGGCAGGG TGCTCAGTGG GTGGGCATGG 
TCTGCGGTGT TCGCGGCTCG TATGGAGGAG TGTGCGCGGG 
TGGGATGTGC GGGAGATGTT GGCGCGGCCG GATGTGGCGG 
CCGGCCAGCT GGGCGGTCGC GGTCAGCCTG GCCGCACTGT 
CCCGACGCGG TGATCGGACA CTCCCAGGGC GAGATCGCGG 
CTCAGCCTTG AGGACGCCGC CCGCGTGGTG GCCTTGCGCA 
6444 1 GCCAGGTCAT CGCGGCGCGA CTGGCCGGGC GGGGAGCGAT GGCTTCGGTG GCATTGCCGG
64 501 CCGGTGAGGT CGGTCTGGTC GAGGGCGTGT GGATCGCGGC GCGTAACGGC CCCGCCTCGA 
64 561 CAGTCGTGGC CGGCGAGCCG TCGGCGGTGG AGGACGTGGT GACGCGGTAT GAGACCGAAG 
64 621 GCGTGCGAGT GCGTCGTATC GCCGTCGACT ACGCCTCCCA CACGCCCCAC GTGGAAOCCA 
5 64 681 TCGAGGACGA ACTCGCTGAG GTACTGAAGG GAGTTGCAGG GAAGGCCGCG TCGGTGGCGT 
64741 GGTGGTCGAC CGTGGACAGC GCCTGGGTGA CCGAGCCGGT GGATGAGAGT TACTGGTACC 
64 801 GGAACCTGCG TCGCCCCGTC GCGCTGGACG CGGCGGTGGC GGAGCTGGAC GGGTCCGTGT 
64 8 61 TCGTGGAGTG CAGCGCCCAT CCGGTGCTGC TGCCGGCGAT GGAACAGGCC CACACGGTGG 

64 921 CGTCGTTGCG CACCGGTGAC GGCGGCTGGG AGCGATGGCT GACGGCGTTG GCGCAGGCGT 
10 64 981 GGACCCTGGG CGCGGCAGTG GACTGGGACA CGGTGGTCGA ACCGGTGCCA GGGCGGCTGC 

6504 1 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGCCA 
65101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCCATCACGG 
65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC 
65221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGCT GCCGGGCACG GCCTTTGTGG 
15 65281 AGCTGGTCAT CCGGGCCGGT GACGAGACCG GTTGCGGGAT AGTGGATGAA CTGGTCATCG 
6534 1 AATCCCCCCT CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGAAGGAG 
654 01 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCGGCAGCT 
654 61 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACGCTTCCG 

65 521 GTGTTGTCGG TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCCGTCGACA 
20 65581 CCTCGGAGTT CTACTTGCGC CTGGACGCGC TGGGCTACCG GTTCGGACCC ATGTTCCGCG 

6 5641 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCGTGTACGC CGAGGTCGCG CTCCCCGAGG 
657 01 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC 
657 61 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCG^TCT 
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC 

25 65881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA 
6594 1 TCGACGCGCT CGTGACCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC 
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC 
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC TCGGGGAGAC CCGGGACCTG ACCACCCGTG 
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC CGGTGATCTT CCAGGTGACC GGTGGCCTCG 

30 66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT 
6624 1 TCCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG 
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGCTGATGC 
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCGTCCGCCA 
66421 CCGGTTCCCT CGACGACCTT GCCGTCGTCC CCACCGACGC CCCGGACCGG CCGCTCGCGG 

35 66481 CCGGCGAGGT GCGGATCGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG 
6654 1 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCGTCCTGG 
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG 
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT 
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG 

40 66781 TCGACCTGGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG 
6684 1 TCGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA 
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA 
66961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT 
67 021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA 

45 67081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA 
67141 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC 
67201 CGGTCCACGC CTGGGACGTG CGGCAGGCGC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC 
67 2 61 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG 
67 321 TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC 

50 67 381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT 
674 4 1 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA 
67 501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG 
67 5 61 AGCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA 
67 621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA 

55 67681 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC 
67741 GCCGTGCGCA AGGGCTGCCC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGT CAGCG 
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC 
67 8 61 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG 
67 921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG CGCCGTCGCG CCGTT-GCTCC 

60 67 981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG 
6804 1 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCATG^AGG 
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG 
68161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC 
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACCCGACGG 
682 81 CGGAGGCGCT CACCGCCCAC
6834 1 GGGAGTCCCT GCCCGCGGTG 
684 01 CGATCGCCAT CGTGGCGATG 
684 61 TGTGGCGGCT CGTCGAGTCC 
5 68 521 GGGACGTCGA CGCGCTGTAC 
68 581 GGGGCGGTTA CCTGGCCGGG 
68 64 1 GCGAAGCGCT CGGCATGGAC 
68 701 TCGAGCGCGG CCGGATCAGT 
687 61 GTGCGGCCGC GCAGGGCTAC 

10 68 821 GTGGTTCCAC GAGCCTGCTG 
68881 CGGTCACCGT GGACACGGCG 
68941 GGCTGCGCCT GGGCGAGTGC 
69001 CGGCCGCGTT CGTGGAGTTC 
69061 CGTTCGGCGC GGGCGCGGAC 

15 69121 AACGGCTCTC CGACGCCGAG 

6 9181 CCGTCACGTC CGACGGCGCC 
69241 GGGTCATCCG GAAGGCGCTC 
69301 AGGGGCACGG CACCGGCACC 
69361 CGTACGGGCA GGACCGTCCG 

20 694 21 ATGCC ACGGC CGCGGCCGGT 
694 81 GCACGATGCC GCGGACGCTG 
69543 GACAGGTGTC CCTGCTCGGC 
69601 CGGCCGTCTC CGCGTTCGGG 
69661 GTCCGGCGCC CGTGGCGTCC 

25 69721 CGTGGGTGCT CTCCGCGCGG 
69781 ACCACCTCGC GGCGGCACCG 
69841 GCCGCGCCCA GTTCGCCCAC 
69901 CCGCGCTCGA CGGCCTCGCG 
69961 AGGAGCGGCG CGTCGCCTTC 

30 7 0021 GCGAGCTCCA CCGCCGGTTC 

7 008 i TCGGCAAGCA CCTCAAGCAC 
7 0141 CCCATGACAC CCTGTACGCC 
7 0201 TGCTGGAGCA CTGGGGGGTG 
7 0261 CCGCGGCGTA CGCGGCGGGG 

35 70321 GGGGGCGGGC GCTGCGGGCG 
7 0 381 CGGAGGTCGG CGCCCGCACG 
704 4 1 TGCTCGCCGG TTCGCCGGAC 
7 0501 GGCGCACGAA ACGGCTCGAC 
7 0561 TCGACGGCTT CCGTACGGTG 

40 7 0 621 TGTCCACGAC GACGGGCCGG 
7 0681 GCCATGCGCG TCGGCCGGTG 
7 0741 TCACCACGTT CGTGGCCGTC 
7 0801 CCGGGGAGGA CGCCGGGACC 
7 0861 CGGCGCTGAC CGCCCTCGCC 

45 7 0921 TAC1 GGCCGG TGGCCGGCCA 
7 0981 GGCTGGCCCC GGCCGTGGCG 
71041 AGTCCGAGCC GGAGGACCTC 
71101 TCGGCGTCAC GGACCCCGCC 
71161 ACTCACTGGC GGTGCAGCGG 

50 7122i CGGCGGCCGT CCTGTTCGAC 
7 1281 GGATCGAGGC CGGCCAGGAC 
71341 TCTCGCTCCT GGAGGAGATG 
714 01 CGGAGCGTGC GGCCATCGCC 
71461 GATGAGCACC GATACGCACG 

55 71521 GGACGGTCAC CGCGCCATCC 
71581 CAAGCACTGG CTGGTCGCCG 
7 1641 CAGCTCGGCC GCGCCGTCCG 
7 1701 GGACTCACCG GAGCACAACC 
7 1761 GGCGCGCAAG CGGGAGGACT 

60 7 1821 GGCCGCGGGA CCCGGCACCG 
7 1881 CATCAACGCG CTGTACGGGC 
71941 CGACATCACC GGCTCGGCCG 
7 2001 GCACGCGGTG CGGCTGGTCC 
7 2061 GCTGGCCTCG GCCGACGACG 
CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG
ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC 
GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC 
GGCACCGACG CGATCACCAC GCCTCCTGAC GACCGCGGCT 
GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC 
GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC 
CCGCAGCAAC GCCTGCTGCT CGAAACGGCG TGGGAGGCGA 
CCGGCGTCGC TCCGCGGCCG GGAGGTCGGC GTCTATGTCG 
GGGCTGGGCG CCGAGGACAC CGAGGGCCAC GCGATCACCG 
TCCGGACGGC TGGCGTACGT GCTCGGGCTG GAGGGCCCGG 
TGCTCGTCGT CTCTGGTCGC GCTGCATCTG GCGTGCCAGG 
GAACTCGCTC TGGCCGGAGG GGTCTCCGTA CTGAGTTCGC 
TCCCGCCAGC GCGGGCTCGC GGCCGACGGG CGCTGCAAGT 
GGCACGACGT GGTCCGAGGG CGTGGGCGTG CTCGTACTGG 
CGGCTCGGGC ACACCGTGCT CGCCGTCGTC CGCGGChGCG 
TCCAACGGCC TCACCGCGCC GAACGGGCTC TCGCAGCAGC 
GCCGCGGCCG GGCTGACCGG CGCCGACGTG GACGTCGTCG 
CGGCTCGGCG ACCCGGTCGA GGCGGACGCG CTGCTCGCGA 
GCACCGGTCT GGCTGGGCTC GCTGAAGTCG AACATCGGAC 
GTCGCGGGCG TCATCAAGAT GGTGCAGGCG ATCGGCGCGG 
CATGTGGAGG AGCCCTCGCC CGCCGTCGAC TGGAGCACCG 
TCCAACCGGC CCTGGCCGGA CGACGAGCGT CCGCGCCGGG 
CTCAGCGGGA CGAACGCGCA CGTCATCCTG GAACAGCACC 
CAGCCGCCCC GGCCGCCCCG TGAGGAGTCC CAGCCGCTGC 
ACTCCGGCCG CGCTGCGGGC CCAGGCGGCC CGGCTGCGCG 
GACGCGGATC CGTTGGACAT CGGGTACGCG CTGGCCACCA 
CGTGCCGCGG TCGTCGCCAC CACCCCGGAC GGATTCCGTG 
GACGGCGCGG AGGCGCCCGG AGTCGTCACC GGGACCGCTC 
CTCTTCGACG GCCAGGGCGC CCAGCGCGCC GGAATGGGGC 
CCCGTCTTCG CCGCCGCGTG GGACGAGGTC TCCGACGCGT 
TCCCCCACGG ACGTCTACCA CGGCGAACAC GGCGCTCTCG 
CAGGCCGGCC TGTTCACGCT CGAAGTGGCG CTGCTGCGGC 
CGGCCGGACG TGCTCGTCGG GCACTCCGTC GGCGAGGTGA 
GTGCTCACCC TGGCGGACGC GACGGAGTTG ATCGTGGCCC 
CTGCCGCCCG GGGCGATGCT CGCCGTCGAC GGAAGCCCGG 
GATCTGGACA TCGCCGCGGT CAACGGCCCG TCCGCCGTGG 
GATGTGGCGG CGTTCGAACG GGAGTGGTCG GCGGCCGGGC 
GTCGGGCACG CGTTCCACTC CCGGCACGTC GACGGTGCGC 
CTGGAGTCGC TCGCGTTCGG CGCGGCGCGG CTGCCGGTGG 
GACGCCGCGG ACGACCTCAT AACGCCCGCG CACTGGCTGC 
CTGTTCTCGG ATGCCGTCCG GGAGCTGGCC GACCGCGGCG 
GGCCCCTCCG GCTCCCTGGC GTCGGCCGCG GCGGAGAGCG 
TACCACGCGG TGCTGCGCGC CCGGACCGGT GAGGAGACCG 
GAGCTGCACG CCCACGGCGT CCCGGTCGAC CTGGCCGCGG 
GTGGACCTTC CCGTGTACGC GTTCCAGCAC CGTTCCTACT 
GGGGCGCCGG CCACCGTGGC GGACACCGGG GGTCCGGCGG 
ACCGTCGCCG AGATCGTCCG TCGGCGCACC GCGGCGCTGC 
GACGTCGATG CGGAAGCGAC GTTCTTCGCG CTCGGTTTCG 
CTGCGCAACC AGCTCGCCTC GGCAACCGGG CTGGACCTGC 
CACGACACCC CGGCCGCGCT CACCGCGTTC CTCCAGGACC 
CGGATCGAGG CCGGCGAGGA CGACGACGCG CCCACCGTGC 
GAGTCGCTCG ACGCCGCGGA CATCGCGGCG ACGCCGGCCC 
GATCTGCTCG ACAAGCTCGC CCATACCTGG AAGGACTACC 
AGGGAACGCC GCCCGCCGGC CGCTGCCCAT TCGCGATCCA 
TGGAGAGCGG CACGGTGGGT TCGTTCGACC TGTTCGGCGT 
CCGCCGAGGA CGTCAAGCTG GTCACCAACG ATCCGCGGTT 
AGATGCTGCC CGACCGGCGG CCCGGCTGGT TCTCCGGGAT 
GCTACCGGCA GAAGATCGCG GGGGACTTCA CACTGCGCGC 
TCGTCGCCGA GGCCGCCGAC GCCTGCCTGG ACGACATCGA 
ACCTCATCCC CGGGTACGCC AAGCGGCTGC CCTCCCTCGT 
TCACCCCTGA GGAGGGGGCC GTGCTGGAGG CACGGATGCG 
ATCTGGACAG . CGTCAAGACG CTGACCGACG ACTTCTTCGG 
GCGCGAAGCG TGACGAGCGG GGCGAGG AC C TGCTGCACCG 
GCGAGATCTC GCTCAGCGAC GACGAGGCGA CGGGCGTGTT 
72121 CGCGACGCTG CTGTTCGCCG
72181 CGCACTGCTC AGCCACCCCG 
7 224 1 CAACGCGGTC GAGGAGATGC 
7 2 301 CTGTGTCGAG GACGTCGATG 
5 72361 GCTCTACTCG ACGGCCAACC 
724 21 GACGCGCCCG CTGGAGGGCA 
724 81 GCACATCGCC CGGGTGCTCA 
7 2 541 CGTCCGGCTG GCCGGCGACG 
7 2 601 GCTGCGGGTC ACCTGGGGGG 

10 7 2 661 GGGACGACGG TCGCGCACAT 
7 2721 ACCCAGCGCT GCTACCTGCG 
72781 GTCGGCGCGA ACATCGGCAT 
7 2841 GTGCACGCCT TCGAGCCCGC 
7 2901 CACGGCATCC CGGGCCAGGC 

15 7 2 961 ATGACCTTCT ATCCCGACGC 
7 3021 ACGGAGCTGT TGCGCACGCT 
7 3081 ATGCTCGCGC AACTGCCCGA 
7 314 1 GACGTCATCG CGGAGCGCGG 
7 3201 AGCGAACGGC AGGTCTTCGC 

20 7 3261 GTCGCGGAGG TCCACGACAT 
7 3321 CATGGCTTCA CCGTGGTCGC 
7 3 381 GTCGCCGCGC GGCGGGTGGC 
7 3441 GCCGCGGTGC GGACGGCGGC 
7 3501 CCCTTCACCC CCAGCTTGCG 

25 7 3561 ACGAACAGCT GGCTGGCGAT 
7 3621 CGCCGCTCCG CCTCGGTCAG 
7 3681 TCCGCGTCCG AGGACTCCCC 
7 3741 GCGAGGTGCC GTGCGCGGCG 
7 3801 CACGCTTCGC CCATGTCGGC 

30 7 3861 AGCAGATCGG CGGCCTCGTC 
7 3921 TGCACCCGCA GCGTCATCAC 
7 3981 ATGAGCCTCA GCCCCTCGTC 
7 4 041 ACCCGCCACA GGGCCAGGCC 
7 4101 TCCCGGAACG CGTTGTACGC 

35 7 4 161 GCCCAGACCA TGTGCAGTCC 
7 4 221 AGCCACCGCT CCGCCCGGTC 
7 4 281 AGCGGCAATG CGGCGGCCAT 
7 4 341 CCGCATTCGA CGGCGGCGGT 
7 4 401 GCGTGGACCG CCTCGTCGGC 

40 74 4 61 CAGGACTGGA CGGCATCGGT 
7 4 521 GTGGTCCGGT CCGTCGTGAC 
7 4 581 TGTTCGGACC AGCCGCGCAG 
7 4 641 ACGGCTCCGG AAAACGAGGC 
7 4 701 TCGGCCGCGC CGGGATAGAT 

45 7 4 761 CCCTGCTCGC TCGGGGCGGC 
7 4 821 CGCCCGTCCA TCGCCAGCCA 
7 4 881 TCCCGCGACG CGGTGAGCAG 
7 4 941 CGCTCGATGG CGGCGGTGTC 
7 5001 CGGTAGGCGA ACTCCAGGTA 

50 7 5061 CGCGCGGCGT CGGTGAACAG 
7 5121 TGGTGGCGGG CGAGCACCTT 
7 5181 TCGTGCAGGC CACGCCGCTC 
7 5241 GGGTGCGGGA ACCGCCCTTC 
7 5 301 TCGACCGCCT CGGTGTCGAG 

55 7 5361 CCGAGCACGG CGGAAGCTCG 
7 5421 CCGAGGTAGG CGAGCCGGTA 
7 5481 GTCCGTGCCT CCCGGATGTC 
7 5541 GCCCGGAACG CCTGGGCCAC 
7 5601 AGTTCGGTGG TCTGCGCCTC 

60 7 5 661 CTCAGCAGTG CCGCCCGGAA 
7 5721 ACGATGGCGA CACGGGCCCG 
75781 GGCGCGTCGG CGTGGTGCAC 
7 5841 GTCAGCACCG TGCGGGTGAG 
7 5901 TCGCACGATG CCGTCAGCCG 
GCCACGACTC GGTGCAGCAG ATGGTCGGCT ACTGCCTCTA
AGCAGCAGGC GGCGCTGCGC GCGCGCCCGG AGCTGGTCGA 
TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCGT 
TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC 
GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT 
ACTTCGCGTT CGGCCACGGC ATTCACAAGT GTCCCGGCCA 
TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA 
TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA 
CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC 
CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC 
CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTCGAC 
GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC 
GCCCGTGCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG 
GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG 
CACGCTGATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG 
CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC 
CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC 
TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG 
CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC 
CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC 
CGAGCAGGAA CCGCTGTTCG CCGGCACGGG CATCCACCAG 
CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG 
TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG 
GAACACGTTG GTGAGGTGCT GTTCCACCGT GCTGGA^GVG 
CTCCTTGTTG GTGCGCCCGA CCGCGGCGTG CGACGCCACC 
CGATGTGATC CGCTGCGCCG GCGTCACGTC CTGGGTGCCG 
PlCCGPlGCCGC CGGAGGAGCG GCACGGCTCC GCACTGGGTC 
GAACAGTCCC CGCGCACGGC TGTGCCGCCG GAGCATGCCG 
GAGGACGCGG GCCAGCTCGT ACTGGTCGCG GCACATGATG 
GAGCAGTTCG ATCCGCTTGG CCGGCGGACT GTAGGCCGCC 
CCGCGCCCGG GACCCCATCG GCCGGGACAG CTGCTCGGAG 
ACGGCCGCGG CCGAGCAGCA GAAGCGCTTC GGCGGCGTCG 
CGGCACGTCG ACGGACCAGC GTCGCATCCG CTCCCCGCAG 
CGCCCGGTAC CGCCCGGCCG CGAGATGGTG TTGCCCACGG 
GAAGAGGCTG TCGGAGGTCT CCTCCGGCAA CGGCTCGGCG 
CAGGTCGCCC AGTCGGATCG CGGCGGCCAC GGTGCTGCTC 
CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG 
CAGGTCGCCG CGGCGCAGCG CGGCCTCGGC GCGGAACCCC 
CGGGGTCCGC ATGTTGTCGT CACCGGCCAG CTTGTCGACC 
GTCCTCGGCG TAGAGCAGGG CCAGCAACGC CATCATGGTC 
CCGGGAGTGC TGGAGCACGT ACTCGGCTTT GGCCTCGGCC 
CGCGTTGCTC AGGGCCTTGT CGGCGACGGC GCGGTGCCGG 
GACCTCGTCC TCGGCCGGCG GATCGGCCGG ACGCGGHGGA 
CAGCGCGAGG GACAGGTCCG CGACGCGCAG GTGCGCCCGG 
GGAGCGCTGG GCCGCCAGGA CCTCGGCGGC CTCGCCCGGC 
GCAGGCGAGC GACACGGCGT GCTCGCTGGA GAGGAGCCGT 
CTCGGGCACA TGCCGGCCGG ATCTGGCGGG ATCGCAGAGC 
GACGCGCAGT GCGGCGTGGA CGGCGGGGTC GTCGGAGGCC 
GGTGACGGCC TCGTCGAGCT CGCCGCGCAG GTGGTGCTCG 
CCCGGCGACC TCGGCGCCGT GCACCCGGCC GGTACCCATC 
GCTGGCCACG CCGCGGTCCC GCAGCAGTTC CAGCGCCAGC 
GGCGGCGGAG AGGTCGTCGA GTACGACGGA GCGGGCCGCG 
CCGCAGCAGC CGCCCCTCGA CCAGCTGTTC GTGGGCCTGC 
GCCGGTCATC CGCTGGACGA GGGTGAGTTC GACACTCTCG 
GGCGACGCTC AGCGCGGCCG GGCCGCAACG ATAGAGCGAC 
CGCCCGCCCC GCGACCACTT CCAGGCACCC TGAGGTCCGT 
GTCGATCAGG CCGTGGCCGA GGAGCAGGTT GCCGCCGGTC 
CACGTCGTCG TGCGCGTCCT GGCCGAGGTG CCGGCGCACG 
GGTGAGCGGG CGCAGCGCGA TCTCCTGGTA GTGGCGCAGA 
TTGGGAGTGG GCGGGCGTCG GCCGGAGCAG CTCGGTCAGC 
GCTGATGCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC 
GTCGTCGATG CCGATCAGTA CGGGCCGCTC CGCGGCGAGC 
TTCGGTCCCC AGGCGGTTGT CGACGTCGGC CGGCAGGTTT 
GACCAGCTCC GGTGTCCGGG CGGCCAGCTC GGGCTGGTCG 
AGGAGCTGGC
AGGGTGACGA 
ATCGGCCCGG 
TGGAGGGAAC 
ACGAATGGAA 
CTGTACGGCT 
GGGCCGTGCC 
CCACCAGCTC 
CCTCCACCGT 
TCTTCGGATC 
CCCACCGGTA 
GCATTTCGTC 
GCAGCTCGTC 
GGCCCGAGAC 
CGCCCATGCT 
CCACGAGGCC 
GGCCGGGGTA 
GGTAGAACGT 
GCGTGGGGAA 
CTGGGGAGCC 
CACGCCCCAT 
GCGATGCACA 
CGGGTGCACC 
GAAGTGGGTA 
CTCGTATCCC 
ACTCGCGCCG 
GGTCAGCTCC 



CGAGCATGCC 
AGCCGGCCTT 
TGACGGCGGC 
CGAACTCGTC 
CTACCTCGCG 
GTGATTCAGC 
GTTCCCTCAG 
GGCGACCCGC 
GGTCGGCGCG 
GTCGTCACCG 
CGTCTCCGCC 
GTCCGCCATC 
GTCGGACGCG 
GATCAGGTGC 
GTGGCCGAAC 
GGCGAGAACA 
CTGCACGGCG 
CGCCGATCCG 
GAACTGCCGC 
CGGAACCGGG 
CCCTCCTCCG 
TCGCGGACCG 
ATCCCCTTGC 
CCGATGATCC 
GAGGTTGACG 
AACGTCGCGC 
CGGATC 



GTACGGCAGG 
GGCCGCGGCG 
GACGACGCCC 
ATCGCGGGCG 
ACCGTCGTGG 
CTGGCGGGAT 
GAGCCGACCG 
TCCTGGTGGT 
GTCGTGTGCC 
ATGCACACCG 
GCGTAGTAGT 
ACATCGGCGC 
AGGTGGTCCT 
GCCACCGGGA 
AGCACCAGCG 
CGCAGGTCGC 
TACACGTCCG 
CCGGCGTGGG 
AGCCAGAGTT 
TGATCTCGGC 
GCGCCAGACA 
CCGACCCGAC 
AGATCAGGCG 
GCTTCACGGA 
CGCAGGTGAC 
GCCCCGGGTG 



GCCCGCTCCT 
GCGTCGAGGA 
CGCCCGCCCC 
ATCAGGTCTG 
AAACCCATAG 
GCTGTGCTAC 
CCCCCGGCGC 
CGACGAGGTA 
CGGCCCAGGC 
TGATCGGCGT 
CCGCCCGCAA 
TCGTCCCGCC 
GGTCGGCGCG 
GCCGCTGGGC 
GACGGTCCAG 
GCACCGCCTC 
CCACCGGGGC 
GCAGCAGCAC 
CCGAGCTCAC 
CAAGTGCTTC 
GAGGACGCCG 
GTCGTCGAGC 
GTTCGCCTCC 
CATCCACAGG 
GATCGTGCCA 
CTCGAACACG 



CCATGGAGCA 
GTTCGGTCTT 
CCGCTCGGGT 
GGGGAGATAA 
GCATCACATG 
AGATGGGAAG 
CACCCGCCGT 
GAAGTGCCCG 
GTGGGCCTGC 
CTCCAGCGGC 
CGGCGCCAGG 
GAGGCCGATG 
CGGCTGCGAC 
CAGCTCGAAC 
CCCCGGCTTC 
CTCGTCGCGG 
GAGCGCACGG 
CACCCGTACC 
CGCACCCCCT 
TCCCGCATCT 
ACTTTGCCGT 
GGGTAGGTCA 
CACGCCTCAC 
TACCGATTGT 
CCCCGACGTG 
ATGTCGGGAT 



CACCGCGCGA 
GCCGCAGGCG 
GAGCGCCCGG 
GCGCGCTATC 
GCTTGTTGAT 
ATGTGA~CTA 
ACCCCCTGGG 
CCGGGGAAGA 
TCCACCGTCG 
GGCGCGGGCT 
ATCAGCGCGC 
ACCGCCGCCA 
GGCGCCCGCC 
GCGAGTGTCG 
AACGCCTCGG 
CGGTCCTGGC 
GCCAGCGGAA 
GGGGCCTCGG 
CGGCCGCGAC 
CCGGGTCGGT 
TGTGCACATT 
CCGACAGCGT 
GATAGTTCGC 
CAAAGGCGTG 
TCACGTAGAC 
CGTCACCGCC 



Those of skill in the art will recognize that, due to the degenerate nature of the
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be
used to encode a given amino acid sequence of the invention. The native DNA sequence
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to
illustrate a preferred embodiment of the invention, and the present invention includes
DNA compounds of any sequence that encode the amino acid sequences of the
polypeptides and proteins of the invention. In similar fashion, a polypeptide can typically
tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid
sequence without loss or significant loss of a desired activity. The present invention
includes such polypeptides with alternate amino acid sequences, and the amino acid
sequences shown merely illustrate preferred embodiments of the invention.
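
The degeneracy of the genetic code can be illustrated with a minimal Python sketch. It
assumes the standard codon assignments; the function name `translate` and the two
nine-base sequences are hypothetical examples chosen for illustration and are not taken
from the FK-520 gene cluster.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences (assumption): they differ at two third-codon positions.
native = "ATGGCCGAC"
variant = "ATGGCGGAT"

print(translate(native), translate(variant))  # prints "MAD MAD"
```

Both sequences encode the same tripeptide although their nucleotide sequences differ,
which is the sense in which a variety of DNA compounds can encode a given amino acid
sequence.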

The recombinant nucleic acids, proteins, and peptides of the invention are many
and diverse. To facilitate an understanding of the invention and the diverse compounds
and methods provided thereby, the following general description of the FK-520 PKS
genes and modules of the PKS proteins encoded thereby is provided. This general
description is followed by a more detailed description of the various domains and
modules of the FK-520 PKS contained in and encoded by the compounds of the
invention. In this description, reference to a heterologous PKS refers to any PKS other
than the FK-520 PKS. Unless otherwise indicated, reference to a PKS includes reference
to a portion of a PKS. Moreover, reference to a domain, module, or PKS includes
reference to the nucleic acids encoding the same and vice-versa, because the methods
and reagents of the invention provide or enable one to prepare proteins and the nucleic
acids that encode them.

The FK-520 PKS is composed of three proteins encoded by three genes
designated fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7 - 10 of the
PKS. The fkbB ORF encodes the loading module (the CoA ligase) and extender modules
1 - 4 of the PKS. The fkbC ORF encodes extender modules 5 - 6 of the PKS. The fkbP
ORF encodes the NRPS that attaches the pipecolic acid and cyclizes the FK-520
polyketide.
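
The gene-to-module assignment just described can be summarized in a short Python
sketch. The assignments are those stated in the text; the dictionary layout, the module
labels, and the function names are illustrative assumptions only.

```python
# Gene-to-module assignment as stated in the description.
# The data layout and label strings are illustrative assumptions.
FK520_PKS_GENES = {
    "fkbB": ["loading"] + [f"extender {n}" for n in range(1, 5)],   # loading, 1-4
    "fkbC": [f"extender {n}" for n in range(5, 7)],                 # 5-6
    "fkbA": [f"extender {n}" for n in range(7, 11)],                # 7-10
}


def gene_for_module(module):
    """Return the name of the gene whose ORF encodes the given module."""
    for gene, modules in FK520_PKS_GENES.items():
        if module in modules:
            return gene
    raise KeyError(module)


def biosynthetic_order():
    """List the modules in the order they act: loading, then extenders 1 to 10."""
    return [m for gene in ("fkbB", "fkbC", "fkbA") for m in FK520_PKS_GENES[gene]]


print(gene_for_module("extender 6"))  # prints "fkbC"
print(biosynthetic_order())
```

The fkbP gene is left out of the dictionary because, per the description, it encodes the
NRPS rather than one of the PKS modules.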

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain,
and an ACP domain. The starter building block or unit for FK-520 is believed to be a
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The
recombinant DNA compounds of the invention that encode the loading module of the
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a
variety of methods and in a variety of compounds. In one embodiment, a DNA
compound comprising a sequence that encodes the FK-520 loading module is inserted
into a DNA compound that comprises the coding sequence for a heterologous PKS. The
resulting construct, in which the coding sequence for the loading module of the
heterologous PKS is replaced by the coding sequence for the FK-520 loading module,
provides a novel PKS coding sequence. Examples of heterologous PKS coding
sequences include the rapamycin, FK-506, rifamycin, and avermectin PKS coding
sequences. In another embodiment, a DNA compound comprising a sequence that
encodes the FK-520 loading module is inserted into a DNA compound that comprises the
coding sequence for the FK-520 PKS or a recombinant FK-520 PKS that produces an
FK-520 derivative.
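
The loading-module replacement described above can be pictured with a minimal Python
sketch that treats a PKS as an ordered set of modules. The class names, the function
`swap_loading_module`, and the two-module host are hypothetical; the sketch models only
the bookkeeping of the replacement, not any cloning procedure or any actual
heterologous PKS architecture.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Module:
    name: str
    domains: tuple


@dataclass(frozen=True)
class PKS:
    name: str
    loading: Module
    extenders: tuple


# Domain content of the FK-520 loading module as stated in the description.
FK520_LOADING = Module("FK-520 loading", ("CoA ligase", "ER", "ACP"))


def swap_loading_module(host, new_loading):
    """Return a new PKS description in which the loading module is replaced."""
    return replace(host, name=f"{host.name} [{new_loading.name}]", loading=new_loading)


# Hypothetical host (assumption): a toy PKS with one extender module.
host = PKS(
    name="hypothetical heterologous PKS",
    loading=Module("host loading", ("AT", "ACP")),
    extenders=(Module("extender 1", ("KS", "AT", "ACP")),),
)

hybrid = swap_loading_module(host, FK520_LOADING)
print(hybrid.loading.domains)
```

The host description is left unchanged and the extender modules carry over to the
hybrid, mirroring a construct in which only the loading module coding sequence is
replaced.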

In another embodiment, a portion of the loading module coding sequence is
utilized in conjunction with a heterologous coding sequence. In this embodiment, the
invention provides, for example, either replacing the CoA ligase with a different CoA
ligase, deleting the ER, or replacing the ER with a different ER. In addition, or
alternatively, the ACP can be replaced by another ACP. In similar fashion, the
corresponding domains in another loading or extender module can be replaced by one or
more domains of the FK-520 PKS. The resulting heterologous loading module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide.

The first extender module of the FK-520 PKS includes a KS domain, an AT
domain specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP
domain. The recombinant DNA compounds of the invention that encode the first
extender module of the FK-520 PKS and the corresponding polypeptides encoded
thereby are useful for a variety of applications. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 first extender module is inserted into a
DNA compound that comprises the coding sequence for a heterologous PKS. The
resulting construct, in which the coding sequence for a module of the heterologous PKS
is either replaced by that for the first extender module of the FK-520 PKS or the latter is
merely added to coding sequences for modules of the heterologous PKS, provides a
novel PKS coding sequence. In another embodiment, a DNA compound comprising a
sequence that encodes the first extender module of the FK-520 PKS is inserted into a
DNA compound that comprises the remainder of the coding sequence for the FK-520
PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, all or only a portion of the first extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the
DH or KR or both with another DH or KR; and/or inserting an ER. In replacing or
inserting KR, DH, and ER domains, it is often beneficial to replace the existing KR, DH,
and ER domains with the complete set of domains desired from another module. Thus, if 
one desires to insert an ER domain, one may simply replace the existing KR and DH 
domains with a KR, DH, and ER set of domains from a module containing such
domains. In addition, the KS and/or ACP can be replaced with another KS and/or ACP.
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or 
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a gene for a PKS that produces a polyketide other than FK-520, or 
from chemical synthesis. The resulting heterologous first extender module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the first extender module of the FK-520 PKS. 
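
The domain replacements described above for the first extender module can be pictured with a small data model. The following Python sketch is illustrative only: the `Module` class, its field names, and the helper functions are hypothetical conveniences introduced here, and the sketch tracks only which domains a hybrid module coding sequence would encode, not how such a construct is built.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Reductive domains that may be deleted, replaced, or inserted in a module.
REDUCTIVE_DOMAINS = {"KR", "DH", "ER"}


@dataclass(frozen=True)
class Module:
    """Hypothetical record of the domains encoded by one extender module."""
    name: str
    at_specificity: str          # extender unit accepted by the AT domain
    reductive: Tuple[str, ...]   # reductive domains present, in listed order

    def domains(self) -> Tuple[str, ...]:
        # Every extender module described in the text has KS, AT, and ACP.
        return ("KS", f"AT({self.at_specificity})", *self.reductive, "ACP")


def replace_at(module: Module, at_specificity: str) -> Module:
    """Return a hybrid module whose AT has a different specificity."""
    return replace(module, at_specificity=at_specificity)


def replace_reductive_set(module: Module, new_set: Tuple[str, ...]) -> Module:
    """Swap the whole reductive set at once, as the text recommends."""
    unknown = set(new_set) - REDUCTIVE_DOMAINS
    if unknown:
        raise ValueError(f"not a reductive domain: {sorted(unknown)}")
    return replace(module, reductive=tuple(new_set))


# First extender module as described: KS, AT (methylmalonyl CoA), DH, KR, ACP.
fk520_module1 = Module("FK-520 extender module 1", "methylmalonyl CoA", ("DH", "KR"))

# Insert an ER by replacing the existing KR and DH with a KR, DH, and ER set,
# and replace the AT with a malonyl CoA specific AT.
hybrid = replace_reductive_set(
    replace_at(fk520_module1, "malonyl CoA"), ("KR", "DH", "ER")
)
print(hybrid.domains())
```

Treating the reductive domains as a single replaceable set mirrors the observation above that it is often beneficial to exchange the complete set of KR, DH, and ER domains rather than to insert one domain alone.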

In an illustrative embodiment of this aspect of the invention, the invention
provides recombinant PKSs and recombinant DNA compounds and vectors that encode
such PKSs in which the KS domain of the first extender module has been inactivated.
Such constructs are especially useful when placed in translational reading frame with the 
remaining modules and domains of an FK-520 or FK-520 derivative PKS. The utility of 
these constructs is that host cells expressing, or cell free extracts containing, the PKS 
encoded thereby can be fed or supplied with N-acylcysteamine thioesters of novel
precursor molecules to prepare FK-520 derivatives. See U.S. patent application Serial
No. 60/117,384, filed 27 Jan. 1999, and PCT patent publication Nos. US97/02358 and
US99/03986, each of which is incorporated herein by reference. 

The second extender module of the FK-520 PKS includes a KS, an AT specific
for methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA
compounds of the invention that encode the second extender module of the FK-520 PKS
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 second extender module is inserted into a DNA compound that comprises the
coding sequence for a heterologous PKS. The resulting construct, in which the coding
sequence for a module of the heterologous PKS is either replaced by that for the second
extender module of the FK-520 PKS or the latter is merely added to coding sequences
for the modules of the heterologous PKS, provides a novel PKS coding sequence. In
another embodiment, a DNA compound comprising a sequence that encodes the second
extender module of the FK-520 PKS is inserted into a DNA compound that comprises
the coding sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS
that produces an FK-520 derivative.

In another embodiment, all or a portion of the second extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous second extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding
domains in a module of a heterologous PKS can be replaced by one or more domains of
the second extender module of the FK-520 PKS. 

The third extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the third extender module of the FK-520 PKS and the
corresponding polypeptides encoded thereby are useful for a variety of applications. In
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
third extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the third extender
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the third extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that
produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding
sequence can originate from a coding sequence for another module of the FK-520 PKS,
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous third extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding
domains in a module of a heterologous PKS can be replaced by one or more domains of
the third extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds 
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the fourth extender module of the FK-520 PKS and the
corresponding polypeptides encoded thereby are useful for a variety of applications. In
one embodiment, a DNA compound comprising a sequence that encodes the FK-520
fourth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the fourth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the fourth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the 
remainder of the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative.

In another embodiment, a portion of the fourth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the ethylmalonyl 
CoA specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; and/or deleting the inactive DH, inserting a KR, a KR and an active DH, or
a KR, an active DH, and an ER. In addition, the KS and/or ACP can be replaced with 
another KS and/or ACP. In each of these replacements or insertions, the heterologous 
KS, AT, DH, KR, ER, or ACP coding sequence can originate from a coding sequence for 
another module of the FK-520 PKS, a PKS for a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous fourth extender module coding sequence
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, 
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the fourth extender module of the FK-520 PKS.

As illustrative examples, the present invention provides recombinant genes,
vectors, and host cells that result from the conversion of the FK-506 PKS to an FK-520
PKS and vice-versa. In one embodiment, the invention provides a recombinant set of 
FK-506 PKS genes but in which the coding sequences for the fourth extender module or 
at least those for the AT domain in the fourth extender module have been replaced by 
those for the AT domain of the fourth extender module of the FK-520 PKS. This
recombinant PKS can be used to produce FK-520 in recombinant host cells. In another
embodiment, the invention provides a recombinant set of FK-520 PKS genes but in 
which the coding sequences for the fourth extender module or at least those for the AT 
domain in the fourth extender module have been replaced by those for the AT domain of 
the fourth extender module of the FK-506 PKS. This recombinant PKS can be used to
produce FK-506 in recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which 
the AT domain of module 4 has been replaced with a malonyl specific AT domain to 
provide a PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT
domain to provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid 
PKS of the invention is prepared by replacing the AT and inactive KR domain of FK-520
extender module 4 with a methylmalonyl specific AT and an active KR domain, such as, 
for example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce
21-desethyl-21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these
hybrid PKS enzymes are neurotrophins. 

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of 
the invention that encode the fifth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
fifth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the fifth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the
modules of the heterologous PKS, provides a novel PKS. In another embodiment, a 
DNA compound comprising a sequence that encodes the fifth extender module of the 
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the fifth extender module coding sequence is
utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting any one or both of the DH and KR; replacing 
any one or both of the DH and KR with another KR and/or DH; and/or inserting an ER.
In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous fifth extender module coding sequence
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520,
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the fifth extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth
extender module have been deleted or mutated to render the DH non-functional. In one 
such mutated gene, the KR and DH coding sequences are replaced with those encoding 
only a KR domain from another PKS gene. The resulting PKS genes code for the
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-19 to
C-20 double bond of FK-520 and has a C-20 hydroxyl group. Such analogs are preferred
neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant fifth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment,
the present invention provides a recombinant FK-520 PKS that contains both this fifth
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 
producing host cells that have been mutated to prevent production of FK-506 but that
express this recombinant PKS and so synthesize the corresponding (lacking the C-19 to
C-20 double bond of FK-506 and having a C-20 hydroxyl group) FK-506 derivative. In 
another embodiment, the present invention provides a recombinant FK-506 PKS in 
which the DH domain of module 5 has been deleted or otherwise rendered inactive and 
thus produces this novel polyketide. 

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA
compounds of the invention that encode the sixth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 sixth extender module is inserted into a DNA compound that comprises the
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the sixth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences 
for the modules of the heterologous PKS, provides a novel PKS coding sequence. In
another embodiment, a DNA compound comprising a sequence that encodes the sixth
extender module of the FK-520 PKS is inserted into a DNA compound that comprises
the coding sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting any one, two, or all three of the KR, DH, and
ER; and/or replacing any one, two, or all three of the KR, DH, and ER with another KR,
DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS and/or
ACP. In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP 
coding sequence can originate from a coding sequence for another module of the FK-520 
PKS, from a coding sequence for a PKS that produces a polyketide other than FK-520, or 
from chemical synthesis. The resulting heterologous sixth extender module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the sixth extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the
sixth extender module have been deleted or mutated to render them non-functional. In 
one such mutated gene, the KR, ER, and DH coding sequences are replaced with those 
encoding only a KR domain from another PKS gene. This can also be accomplished by 
simply replacing the coding sequences for extender module six with those for an
extender module having a methylmalonyl specific AT and only a KR domain from a
heterologous PKS gene, such as, for example, the coding sequences for extender module
two encoded by the eryAI gene. The resulting PKS genes code for the expression of an
FK-520 PKS that produces an FK-520 analog that has a C-18 hydroxyl group. Such
analogs are preferred neurotrophins, because they have little or no immunosuppressant
activity. This recombinant sixth extender module coding sequence can be combined with
other coding sequences to make additional compounds of the invention. In an illustrative 
embodiment, the present invention provides a recombinant FK-520 PKS that contains 
both this sixth extender module and the recombinant fourth extender module described 
above that comprises the coding sequence for the fourth extender module AT domain of
the FK-506 PKS. The invention also provides recombinant host cells derived from
FK-506 producing host cells that have been mutated to prevent production of FK-506 but
that express this recombinant PKS and so synthesize the corresponding (having a C-18
hydroxyl group) FK-506 derivative. In another embodiment, the present invention 
provides a recombinant FK-506 PKS in which the DH and ER domains of module 6 have 
been deleted or otherwise rendered inactive and thus produces this novel polyketide.
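
The reductive domain changes described for modules 5 and 6 follow the usual pattern of modular PKS reduction: a KR alone leaves a hydroxyl group, a KR and DH together leave a double bond, and a KR, DH, and ER together leave a fully reduced carbon. A minimal Python sketch of that rule follows. The function name and the returned labels are hypothetical, and the rule is a simplification that ignores stereochemistry and any modification occurring after chain assembly.

```python
def beta_carbon_state(active_domains):
    """Map the set of active reductive domains in a module to the
    functional group expected at the corresponding carbon (simplified)."""
    active = frozenset(active_domains)
    if not active <= {"KR", "DH", "ER"}:
        raise ValueError(f"unknown domain(s): {sorted(active - {'KR', 'DH', 'ER'})}")
    if not active:
        return "keto"          # no reduction
    if active == {"KR"}:
        return "hydroxyl"      # ketoreduction only
    if active == {"KR", "DH"}:
        return "double bond"   # reduction, then dehydration
    if active == {"KR", "DH", "ER"}:
        return "saturated"     # full reduction
    # For example, a DH without a KR has no hydroxyl group to act on.
    raise ValueError("not a functional reductive set in this simplified model")


# Module 5 as described (DH and KR) versus module 5 with the DH deleted.
print(beta_carbon_state({"KR", "DH"}))   # native: C-19 to C-20 double bond
print(beta_carbon_state({"KR"}))         # DH deleted: C-20 hydroxyl group

# Module 6 as described (KR, DH, ER) versus module 6 with DH and ER deleted.
print(beta_carbon_state({"KR", "DH", "ER"}))
print(beta_carbon_state({"KR"}))         # C-18 hydroxyl group
```

Under these assumptions, retaining only the KR in either module gives the hydroxyl-bearing analogs described above.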

The seventh extender module of the FK-520 PKS includes a KS, an AT specific 
for 2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS
and the corresponding polypeptides encoded thereby are useful for a variety of
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 seventh extender module is inserted into a DNA compound that comprises
the coding sequence for a heterologous PKS. The resulting construct, in which the
coding sequence for a module of the heterologous PKS is either replaced by that for the 
seventh extender module of the FK-520 PKS or the latter is merely added to coding
sequences for the modules of the heterologous PKS, provides a novel PKS coding
sequence. In another embodiment, a DNA compound comprising a sequence that 
encodes the seventh extender module of the FK-520 PKS is inserted into a DNA 
compound that comprises the coding sequence for the remainder of the FK-520 PKS or a 
recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion or all of the seventh extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the 2- 
hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, 
KR, ER, or ACP coding sequence can originate from a coding sequence for another 
module of the FK-520 PKS, from a coding sequence for a PKS that produces a 
polyketide other than FK-520, or from chemical synthesis. The resulting heterologous
seventh extender module coding sequence can be utilized in conjunction with a coding
sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or another 
polyketide. In similar fashion, the corresponding domains in a module of a heterologous 
PKS can be replaced by one or more domains of the seventh extender module of the
FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant seventh extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that
contains both this seventh extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of
FK-506 but that express this recombinant PKS and so synthesize the corresponding
(C-15-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides
a recombinant FK-506 PKS in which the AT domain of module 7 has been replaced and 
thus produces this novel polyketide. 

In another illustrative embodiment, the present invention provides a hybrid PKS
in which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, 
the AT and KR domains of extender module 6 of the rapamycin PKS. The resulting 
hybrid PKS produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin 
compound. 

The eighth extender module of the FK-520 PKS includes a KS, an AT specific
for 2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of
the invention that encode the eighth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520
eighth extender module is inserted into a DNA compound that comprises the coding
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the eighth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another
embodiment, a DNA compound comprising a sequence that encodes the eighth extender
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the eighth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
malonyl CoA specific AT; deleting or replacing the KR; and/or inserting a DH or a DH
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP.
In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous eighth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520,
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains in a
module of a heterologous PKS can be replaced by one or more domains of the eighth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-13 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than
FK-520. This recombinant eighth extender module coding sequence can be combined
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this eighth extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT
domain of the FK-506 PKS. The invention also provides recombinant host cells derived
from FK-506 producing host cells that have been mutated to prevent production of
FK-506 but that express this recombinant PKS and so synthesize the corresponding
(C-13-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides
a recombinant FK-506 PKS in which the AT domain of module 8 has been replaced and
thus produces this novel polyketide.
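
The AT replacements described for the seventh and eighth extender modules amount to a simple lookup from AT specificity to the group found at C-15 or C-13, respectively. The Python sketch below records that correspondence as stated in the text; the dictionary and function names are hypothetical, and the native 2-hydroxymalonyl CoA specific AT is listed with the methoxy group found at that position in FK-520.

```python
# Group at the affected position for each AT specificity, per the text.
SUBSTITUENT_FOR_AT = {
    "2-hydroxymalonyl CoA": "methoxy",   # native AT of modules 7 and 8
    "malonyl CoA": "hydrogen",
    "methylmalonyl CoA": "methyl",
    "ethylmalonyl CoA": "ethyl",
}

# Ring position governed by the AT domain of each module, per the text.
POSITION_FOR_MODULE = {7: "C-15", 8: "C-13"}


def describe_analog(module_number, at_specificity):
    """Return the position and group expected after an AT replacement."""
    position = POSITION_FOR_MODULE[module_number]
    group = SUBSTITUENT_FOR_AT[at_specificity]
    return f"{position} {group}"


for module_number in (7, 8):
    for at_specificity in SUBSTITUENT_FOR_AT:
        print(module_number, at_specificity, "->",
              describe_analog(module_number, at_specificity))
```

Any entry other than the native one corresponds to a desmethoxy analog at the indicated position.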

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the ninth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 ninth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the ninth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences 
for the modules of the heterologous PKS, provides a novel PKS coding sequence. In
another embodiment, a DNA compound comprising a sequence that encodes the ninth 
extender module of the FK-520 PKS is inserted into a DNA compound that comprises 
the coding sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the ninth extender module coding sequence
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting any one, two, or all three of the KR, DH, and
ER; and/or replacing any one, two, or all three of the KR, DH, and ER with another KR,
DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS and/or 
ACP. In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP 
coding sequence can originate from a coding sequence for another module of the FK-520 
PKS, from a coding sequence for a PKS that produces a polyketide other than FK-520, or
from chemical synthesis. The resulting heterologous ninth extender module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520
derivative, or another polyketide. In similar fashion, the corresponding domains in a 
module of a heterologous PKS can be replaced by one or more domains of the ninth 
extender module of the FK-520 PKS. 

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that
encode the tenth extender module of the FK-520 PKS and the corresponding 
polypeptides encoded thereby are useful for a variety of applications. In one 
embodiment, a DNA compound comprising a sequence that encodes the FK-520 tenth
extender module is inserted into a DNA compound that comprises the coding sequence
for a heterologous PKS. The resulting construct, in which the coding sequence for a
module of the heterologous PKS is either replaced by that for the tenth extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of 
the heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a 
DNA compound comprising a sequence that encodes the tenth extender module of the
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK- 
520 derivative. 

In another embodiment, a portion or all of the tenth extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH, 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP.
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than 
FK-520, or from chemical synthesis. The resulting heterologous tenth extender module 
coding sequence can be utilized in conjunction with a coding sequence for a PKS that 

20 synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion the 
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the tenth extender module of the FK-520 PKS. 
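
The domain replacements and insertions described above can be pictured with a small data model. The following Python sketch is illustrative only: the class name Module, its fields, and the two lookup tables are hypothetical conveniences and are not part of the invention or of any software library. It records which extender unit the AT domain selects and which reductive domains are present, and reports the side chain and beta-carbon state the module would be expected to contribute under the standard model of modular PKS chemistry.

```python
from dataclasses import dataclass, replace

# Side chain placed on the alpha carbon by each AT specificity named in the text.
SIDE_CHAIN = {
    "malonyl": "H",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
    "2-hydroxymalonyl": "hydroxyl",
}

# Beta-carbon state as a function of the reductive domains present
# (standard modular PKS model; the keto group is left intact without a KR).
BETA_STATE = {
    frozenset(): "ketone",
    frozenset({"KR"}): "hydroxyl",
    frozenset({"KR", "DH"}): "double bond",
    frozenset({"KR", "DH", "ER"}): "methylene",
}


@dataclass(frozen=True)
class Module:
    """Hypothetical description of one extender module (KS and ACP implied)."""
    at_specificity: str
    reductive_domains: frozenset = frozenset()

    def side_chain(self) -> str:
        return SIDE_CHAIN[self.at_specificity]

    def beta_state(self) -> str:
        return BETA_STATE[self.reductive_domains]


# Tenth extender module as described in the text: KS, malonyl-specific AT, ACP.
module_10 = Module(at_specificity="malonyl")

# Hybrid module: swap in a methylmalonyl-specific AT and insert KR and DH domains.
hybrid_10 = replace(
    module_10,
    at_specificity="methylmalonyl",
    reductive_domains=frozenset({"KR", "DH"}),
)

print(module_10.side_chain(), module_10.beta_state())
print(hybrid_10.side_chain(), hybrid_10.beta_state())
```

The model treats each domain choice as independent, which mirrors the text's statement that replacement domains can come from another module, another PKS, or chemical synthesis. It says nothing about whether a given hybrid is catalytically active.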

The FK-520 polyketide precursor produced by the action of the tenth extender
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The
enzyme FkbP is the NRPS-like enzyme that catalyzes these reactions. FkbP also includes
a thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The
present invention provides recombinant DNA compounds that encode the fkbP gene and
so provides recombinant methods for expressing the fkbP gene product in recombinant
host cells. The recombinant fkbP genes of the invention include those in which the
coding sequence for the adenylation domain has been mutated or replaced with coding
sequences from other NRPS-like enzymes so that the resulting recombinant FkbP
incorporates a moiety other than pipecolic acid. For the construction of host cells that do
not naturally produce pipecolic acid, the present invention provides recombinant DNA
compounds that express the enzymes that catalyze at least some of the biosynthesis of
pipecolic acid (see Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a
homolog of RapL, a lysine cyclodeaminase responsible in part for producing the
pipecolate unit added to the end of the polyketide chain. The fkbB and fkbL recombinant
genes of the invention can be used in heterologous hosts to produce compounds such as
FK-520 or, in conjunction with other PKS or NRPS genes, to produce known or novel
polyketides and non-ribosomal peptides.

The present invention also provides recombinant DNA compounds that encode
the P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520.
Figure 2 shows the various sites on the FK-520 polyketide core structure at which these
enzymes act. By providing these genes in recombinant form, the present invention
provides recombinant host cells that can produce FK-520. This is accomplished by
introducing the recombinant PKS, P450 oxidase, and methyltransferase genes into a
heterologous host cell. In a preferred embodiment, the heterologous host cell is
Streptomyces coelicolor CH999 or Streptomyces lividans K4-114, as described in U.S.
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar.
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by
reference. In addition, by providing recombinant host cells that express only a subset of
these genes, the present invention provides methods for making FK-520 precursor
compounds not readily obtainable by other means.

In a related aspect, the present invention provides recombinant DNA compounds
and vectors that are useful in generating, by homologous recombination, recombinant
host cells that produce FK-520 precursor compounds. In this aspect of the invention, a
native host cell that produces FK-520 is transformed with a vector (such as an
SCP2*-derived vector for Streptomyces host cells) that encodes one or more disrupted genes
(i.e., a hydroxylase, a methyltransferase, or both) or merely flanking regions from those
genes. When the vector integrates by homologous recombination, the native, functional
gene is deleted or replaced by the non-functional recombinant gene, and the resulting
host cell thus produces an FK-520 precursor. Such host cells can also be complemented
by introduction of a modified form of the deleted or mutated non-functional gene to
produce a novel compound.

In one important embodiment, the present invention provides a hybrid PKS and
the corresponding recombinant DNA compounds that encode those hybrid PKS
enzymes. For purposes of the present invention a hybrid PKS is a recombinant PKS that
comprises all or part of one or more modules and thioesterase/cyclase domain of a first
PKS and all or part of one or more modules, loading module, and thioesterase/cyclase
domain of a second PKS. In one preferred embodiment, the first PKS is all or part of the
FK-520 PKS, and the second PKS is only a portion or all of a non-FK-520 PKS.

One example of the preferred embodiment is an FK-520 PKS in which the AT
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a
malonyl, methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT
domains include the AT domains from modules 3, 12, and 13 of the rapamycin PKS
and from modules 1 and 2 of the erythromycin PKS. Such replacements, conducted at
the level of the gene for the PKS, are illustrated in the examples below. Another
illustrative example of such a hybrid PKS includes an FK-520 PKS in which the natural
loading module has been replaced with a loading module of another PKS. Another
example of such a hybrid PKS is an FK-520 PKS in which the AT domain of module
three is replaced with an AT domain that binds methylmalonyl CoA.

In another preferred embodiment, the first PKS is most but not all of a
non-FK-520 PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific
for methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for
malonyl CoA.

Those of skill in the art will recognize that all or part of either the first or second
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring
source. For example, only a small portion of an AT domain determines its specificity.
See U.S. provisional patent application Serial No. 60/091,526, incorporated herein by
reference. The state of the art in DNA synthesis allows the artisan to construct de novo
DNA compounds of size sufficient to construct a useful portion of a PKS module or
domain. For purposes of the present invention, such synthetic DNA compounds are
deemed to be a portion of a PKS.

Thus, the hybrid modules of the invention are incorporated into a PKS to provide
a hybrid PKS of the invention. A hybrid PKS of the invention can result not only:

(i) from fusions of heterologous domain (where heterologous means the domains
in that module are from at least two different naturally occurring modules) coding
sequences to produce a hybrid module coding sequence contained in a PKS gene whose
product is incorporated into a PKS,

but also:

(ii) from fusions of heterologous module (where heterologous module means two
modules are adjacent to one another that are not adjacent to one another in naturally
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence
contained in a PKS gene whose product is incorporated into a PKS,

(iii) from expression of one or more FK-520 PKS genes with one or more
non-FK-520 PKS genes, including both naturally occurring and recombinant non-FK-520
PKS genes, and

(iv) from combinations of the foregoing.

Various hybrid PKSs of the invention illustrating these various alternatives are described
herein.

Examples of the production of a hybrid PKS by co-expression of PKS genes from
the FK-520 PKS and another non-FK-520 PKS include hybrid PKS enzymes produced
by coexpression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS
enzymes are produced in recombinant Streptomyces host cells that produce FK-520 or
FK-506 but have been mutated to inactivate the gene whose function is to be replaced by
the rapamycin PKS gene introduced to produce the hybrid PKS. Particular examples
include (i) replacement of the fkbC gene with the rapB gene; and (ii) replacement of the
fkbA gene with the rapC gene. The latter hybrid PKS produces
13,15-didesmethoxy-FK-520, if the host cell is an FK-520 producing host cell, and
13,15-didesmethoxy-FK-506, if the host cell is an FK-506 producing host cell. The compounds
produced by these hybrid PKS enzymes are immunosuppressants and neurotrophins but can
be readily modified to act only as neurotrophins, as described in Example 6, below.

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing
the fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in
which: (a) the extender module 8 through 10, inclusive, coding sequences have been
replaced by the coding sequences for extender modules 12 to 14, inclusive, of the
rapamycin PKS; and (b) the module 8 coding sequences have been replaced by the
module 8 coding sequence of the rifamycin PKS. When expressed with the other,
naturally occurring FK-520 or FK-506 PKS genes and the genes of the modification
enzymes, the resulting hybrid PKS enzymes produce, respectively, (a)
13-desmethoxy-FK-520 or 13-desmethoxy-FK-506; and (b) 13-desmethoxy-13-methyl-FK-520 or
13-desmethoxy-13-methyl-FK-506. In a preferred embodiment, these recombinant PKS
genes of the invention are introduced into the producing host cell by a vector such as
pHU204, which is a plasmid pRM5 derivative that has the well-characterized SCP2*
replicon, the colE1 replicon, the tsr and bla resistance genes, and a cos site. This vector
can be used to introduce the recombinant fkbA replacement gene in an FK-520 or
FK-506 producing host cell (or a host cell derived therefrom in which the endogenous fkbA
gene has been rendered inactive by mutation, deletion, or homologous
recombination with the gene that replaces it) to produce the desired hybrid PKS.

In constructing hybrid PKSs of the invention, certain general methods may be
helpful. For example, it is often beneficial to retain the framework of the module to be
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to
a module, it is often preferred to replace the KR domain of the original module with a
KR, DH, and ER domain-containing segment from another module, instead of merely
inserting DH and ER domains. One can alter the stereochemical specificity of a module
by replacement of the KS domain with a KS domain from a module that specifies a
different stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase
domains of modular polyketide synthases in the choice and stereochemical fate of
extender units," Biochemistry 38(5): 1643-1651, incorporated herein by reference.
Stereochemistry can also be changed by changing the KR domain. Also, one can alter the
specificity of an AT domain by changing only a small segment of the domain. See Lau et
al., supra. One can also take advantage of known linker regions in PKS proteins to link
modules from two different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr.
1999, "Dissecting and Exploiting Intermodular Communication in Polyketide
Synthases," Science 284: 482-485, incorporated herein by reference.

The following Table lists references describing illustrative PKS genes and
corresponding enzymes that can be utilized in the construction of the recombinant PKSs
of the invention and the corresponding DNA compounds that encode them. Also
presented are various references describing tailoring enzymes and corresponding genes
that can be employed in accordance with the methods of the present invention.

Avermectin

    U.S. Pat. No. 5,252,474 to Merck.

    MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular
    Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the
    Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and
    Nemadectin.

    MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the
    Streptomyces avermitilis genes encoding the avermectin polyketide synthase.

    Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the
    polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl.
    Acad. Sci. USA 96: 9509-9514.
Candicidin (FR008)

    Hu et al., 1994, Mol. Microbiol. 14: 163-172.

Epothilone

    U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999.

Erythromycin

    PCT Pub. No. 93/13663 to Abbott.

    U.S. Pat. No. 5,824,513 to Abbott.

    Donadio et al., 1991, Science 252: 675-9.

    Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large
    multifunctional polypeptide in the erythromycin producing polyketide synthase of
    Saccharopolyspora erythraea.

    Glycosylation Enzymes

        PCT Pat. App. Pub. No. 97/23630 to Abbott.

FK-506

    Motamedi et al., 1998, The biosynthetic gene cluster for the macrolactone ring of
    the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534.

    Motamedi et al., 1997, Structural organization of a multifunctional polyketide
    synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur.
    J. Biochem. 244: 74-80.

    Methyltransferase

        U.S. Pat. No. 5,264,355, issued 23 Nov. 1993, Methylating enzyme from
        Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase.

        Motamedi et al., 1996, Characterization of methyltransferase and
        hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and
        FK-520, J. Bacteriol. 178: 5243-5248.
Streptomyces hygroscopicus

    U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998.

Lovastatin

    U.S. Pat. No. 5,744,350 to Merck.

Narbomycin

    U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No.
    60/120,254, filed 16 Feb. 1999.

Nemadectin

    MacNeil et al., 1993, supra.

Niddamycin

    Kakavas et al., 1997, Identification and characterization of the niddamycin
    polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522.

Oleandomycin

    Swan et al., 1994, Characterisation of a Streptomyces antibioticus gene encoding
    a type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet.
    242: 358-362.

    U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999.

    Olano et al., 1998, Analysis of a Streptomyces antibioticus chromosomal region
    involved in oleandomycin biosynthesis, which encodes two glycosyltransferases
    responsible for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(3):
    299-308.

Picromycin

    PCT patent application US99/15047, filed 2 Jul. 1999.

    Xue et al., 1998, Hydroxylation of macrolactones YC-17 and narbomycin is
    mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry
    & Biology 5(11): 661-667.

    Xue et al., Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in
    Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci.
    USA 95: 12111-12116.

Platenolide

    EP Pat. App. Pub. No. 791,656 to Lilly.

Rapamycin

    Schwecke et al., Aug. 1995, The biosynthetic gene cluster for the
    polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92: 7839-7843.

    Aparicio et al., 1996, Organization of the biosynthetic gene cluster for rapamycin
    in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular
    polyketide synthase, Gene 169: 9-16.

Rifamycin

    August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin:
    deductions from the molecular analysis of the rif biosynthetic gene cluster of
    Amycolatopsis mediterranei S699, Chemistry & Biology 5(2): 69-79.
Sorangium PKS

    U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998.

Soraphen

    U.S. Pat. No. 5,716,849 to Novartis.

    Schupp et al., 1995, J. Bacteriology 177: 3673-3679. A Sorangium cellulosum
    (Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic
    Soraphen A: Cloning, Characterization, and Homology to Polyketide Synthase Genes
    from Actinomycetes.

Spiramycin

    U.S. Pat. No. 5,098,837 to Lilly.

    Activator Gene

        U.S. Pat. No. 5,514,544 to Lilly.

Tylosin

    EP Pub. No. 791,655 to Lilly.

    U.S. Pat. No. 5,876,991 to Lilly.

    Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide
    through the construction of a hybrid polyketide synthase.

    Tailoring enzymes

        Merson-Davies and Cundliffe, 1994, Mol. Microbiol. 13: 349-355. Analysis of
        five tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae
        genome.

As the above Table illustrates, there are a wide variety of polyketide synthase
genes that serve as readily available sources of DNA and sequence information for use in
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for
constructing hybrid PKS-encoding DNA compounds are described without reference to
the FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491
and 5,712,146; and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and
09/141,908, filed 28 Aug. 1998, each of which is incorporated herein by reference.

The hybrid PKS-encoding DNA compounds of the invention can be and often are
hybrids of more than two PKS genes. Moreover, there are often two or more modules in
the hybrid PKS in which all or part of the module is derived from a second (or third)
PKS. Thus, as one illustrative example, the present invention provides a hybrid FK-520
PKS that contains the naturally occurring loading module and FkbP as well as modules
one, two, four, six, seven, eight, nine, and ten of the FK-520 PKS and further
contains hybrid or heterologous modules three and five. Hybrid or heterologous module
three contains an AT domain that is specific for methylmalonyl CoA and can be derived,
for example, from the erythromycin or rapamycin PKS genes. Hybrid or heterologous
module five contains an AT domain that is specific for malonyl CoA and can be derived,
for example, from the picromycin or rapamycin PKS genes.

While an important embodiment of the present invention relates to hybrid PKS
enzymes and corresponding genes, the present invention also provides recombinant
FK-520 PKS genes in which there is no second PKS gene sequence present but which differ
from the FK-520 PKS gene by one or more deletions. The deletions can encompass one
or more modules and/or can be limited to a partial deletion within one or more modules.
When a deletion encompasses an entire module, the resulting FK-520 derivative is at
least two carbons shorter than the polyketide produced by the PKS from which the
deleted gene was derived. When a deletion is within a module, the deletion typically
encompasses a KR, DH, or ER domain, or both DH and ER domains, or both KR and DH
domains, or all three KR, DH, and ER domains.
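
The "at least two carbons" figure follows from each extender module adding a two-carbon backbone unit, plus any side-chain carbons carried by its extender unit. A rough sketch of that arithmetic is given below. The function name and the lookup table are hypothetical, and the counts ignore later tailoring steps such as O-methylation.

```python
# Carbons contributed to the polyketide by one extender unit:
# two backbone carbons plus any alkyl side-chain carbons.
CARBONS_PER_UNIT = {
    "malonyl": 2,
    "methylmalonyl": 3,
    "ethylmalonyl": 4,
    "2-hydroxymalonyl": 2,
}


def carbons_lost(deleted_units):
    """Return the number of carbons removed when whole modules are deleted."""
    return sum(CARBONS_PER_UNIT[unit] for unit in deleted_units)


print(carbons_lost(["malonyl"]))
print(carbons_lost(["methylmalonyl", "ethylmalonyl"]))
```

Deleting a single malonyl-specific module gives the minimum loss of two carbons; modules that use branched extender units remove more.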

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one
can employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent
application Serial No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated
herein by reference, in which the large PKS gene is divided into two or more, typically
three, segments, and each segment is placed on a separate expression vector. In this
manner, each of the segments of the gene can be altered, and various altered segments
can be combined in a single host cell to provide a recombinant PKS gene of the
invention. This technique makes more efficient the construction of large libraries of
recombinant PKS genes, vectors for expressing those genes, and host cells comprising
those vectors.
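
The efficiency gain of the multi-vector technique can be illustrated with a short enumeration. The sketch below assumes, purely for illustration, that the gene is split into three segments and that a handful of altered versions of each segment are available on separate vectors. The variant labels are hypothetical placeholders.

```python
from itertools import product

# Hypothetical variants available for each of three gene segments,
# each segment carried on its own expression vector.
segment_variants = {
    "segment_1": ["wild-type", "AT swap A", "AT swap B"],
    "segment_2": ["wild-type", "KR deletion"],
    "segment_3": ["wild-type", "AT swap C", "DH+ER insertion", "module deletion"],
}

# Every combination of one variant per segment defines one recombinant PKS.
library = list(product(*segment_variants.values()))

vectors_built = sum(len(v) for v in segment_variants.values())
print(f"{vectors_built} vectors give {len(library)} PKS combinations")
```

The number of vectors to be built grows with the sum of the variants per segment, while the number of distinct recombinant PKSs grows with their product.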

Thus, in one important embodiment, the recombinant DNA compounds of the
invention are expression vectors. As used herein, the term expression vector refers to any
nucleic acid that can be introduced into a host cell or cell-free transcription and
translation medium. An expression vector can be maintained stably or transiently in a
cell, whether as part of the chromosomal or other DNA in the cell or in any cellular
compartment, such as a replicating vector in the cytoplasm. An expression vector also
comprises a gene that serves to produce RNA that is translated into a polypeptide in the
cell or cell extract. Furthermore, expression vectors typically contain additional
functional elements, such as resistance-conferring genes to act as selectable markers.

The various components of an expression vector can vary widely, depending on
the intended use of the vector. In particular, the components depend on the host cell(s) in
which the vector will be used or is intended to function. Vector components for
expression and maintenance of vectors in E. coli are widely known and commercially
available, as are vector components for other commonly used organisms, such as yeast
cells and Streptomyces cells.

In a preferred embodiment, the expression vectors of the invention are used to
construct recombinant Streptomyces host cells that express a recombinant PKS of the
invention. Preferred Streptomyces host cell/vector combinations of the invention include
S. coelicolor CH999 and S. lividans K4-114 host cells, which do not produce
actinorhodin, and expression vectors derived from the pRM1 and pRM5 vectors, as
described in U.S. Patent No. 5,830,750 and U.S. patent application Serial Nos.
08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of which is
incorporated herein by reference.

The present invention provides a wide variety of expression vectors for use in
Streptomyces. For replicating vectors, the origin of replication can be, for example and
without limitation, a low copy number vector, such as SCP2* (see Hopwood et al.,
Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes
Foundation, Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser
and Melton, 1988, Gene 65: 83-91, each of which is incorporated herein by reference),
SLP1.2 (Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and
SG5(ts) (Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992,
Gene 116: 43-49, each of which is incorporated herein by reference), or a high copy
number vector, such as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129:
2703-2714; Vara et al., 1989, J. Bacteriol. 171: 5782-5781; and Servin-Gonzalez, 1993,
Plasmid 30: 131-140, each of which is incorporated herein by reference). Generally,
however, high copy number vectors are not preferred for expression of genes contained
on large segments of DNA. For non-replicating and integrating vectors, it is useful to
include at least an E. coli origin of replication, such as from pUC, p1P, p1I, and pBR.
For phage based vectors, the phages phiC31 and KC515 can be employed (see Hopwood
et al., supra).

Typically, the expression vector will comprise one or more marker genes by
which host cells containing the vector can be identified and/or selected. Useful antibiotic
resistance conferring genes for use in Streptomyces host cells include the ermE (confers
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance
to thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4
(confers resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and
neomycin), hyg (confers resistance to hygromycin), and vph (confers resistance to
viomycin) resistance conferring genes.

The recombinant PKS gene on the vector will be under the control of a promoter,
typically with an attendant ribosome binding site sequence. The present invention
provides the endogenous promoters of the FK-520 PKS and related biosynthetic genes in
recombinant form, and these promoters are preferred for use in the native hosts and in
heterologous hosts in which the promoters function. A preferred promoter of the
invention is the fkbO gene promoter, comprised in a sequence of about 270 bp between
the start of the open reading frames of the fkbO and fkbB genes. The fkbO promoter is
believed to be bi-directional in that it promotes transcription of the genes fkbO, fkbP, and
fkbA in one direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the
present invention provides a recombinant expression vector comprising the promoter of
the fkbO gene of an FK-520 producing organism positioned to transcribe a gene other
than fkbO. In a preferred embodiment, the transcribed gene is an FK-520 PKS gene. In
another preferred embodiment, the transcribed gene is a gene that encodes a protein
comprised in a hybrid PKS.

Heterologous promoters can also be employed and are preferred for use in host
cells in which the endogenous FK-520 PKS gene promoters do not function or function
poorly. A preferred heterologous promoter is the actI promoter and its attendant activator
gene actII-ORF4, which is provided in the pRM1 and pRM5 expression vectors, supra.
This promoter is activated in the stationary phase of growth when secondary metabolites
are normally synthesized. Other useful Streptomyces promoters include without
limitation those from the ermE gene and the melC1 gene, which act constitutively, and
the tipA gene and the merA gene, which can be induced at any growth stage. In addition,
the T7 RNA polymerase system has been transferred to Streptomyces and can be
employed in the vectors and host cells of the invention. In this system, the coding
sequence for the T7 RNA polymerase is inserted into a neutral site of the chromosome or
in a vector under the control of the inducible merA promoter, and the gene of interest is
placed under the control of the T7 promoter. As noted above, one or more activator
genes can also be employed to enhance the activity of a promoter. Activator genes in
addition to the actII-ORF4 gene discussed above include dnrI, redD, and ptpA genes (see
U.S. patent application Serial No. 09/181,833, supra) to activate promoters under their
control.

In addition to providing recombinant DNA compounds that encode the FK-520
PKS, the present invention also provides DNA compounds that encode the enzymes that
synthesize the ethylmalonyl CoA and 2-hydroxymalonyl CoA utilized in the synthesis of
FK-520. Thus, the present invention also provides recombinant host cells that express
the genes required for the biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA.
Figures 3 and 4 show the location of these genes on the cosmids of the invention and the
biosynthetic pathway that produces ethylmalonyl CoA.

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are
sufficient to confer this ability on Streptomyces host cells. For conversion of
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the
complete coding sequence for fkbH is provided on the cosmids of the invention, the
sequence for this gene provided herein may be missing a T residue, based on a
comparison made with a similar gene cloned from the ansamitocin gene cluster by Dr. H.
Floss. Where the sequence herein shows one T, there may be two, resulting in an
extension of the fkbH reading frame to encode the amino acid sequence:

MTIVKCLVWDLDNTLWRGTVLEDDEVVLTDEIREVITTLDDRGILQAVASKNDH 
DLAWERLERJ-GVAEYFVLARIGWGPKSQSVREIATELNFAPTT 
EVAFHLPEVRCYPAEQAATLLSLPEFSPPVSTVDSRRRRLMYQAGFARDQAREA 
YSGPDEDFLRSLDLSMTIAPAGEEELSRVEELTLRTSQMNATGVHYSDADLRAL
LTDPAHEVLVVTMGDRFGPHGAVGIILLEKKPSTWHLKLLATSCRVVSFGAGAT
IL>TWLTDQGARAGAHLVADFRRTDRNRMMEIAYRFAGFADSDC 
AAGVERLHLEPSARPAPTTLTLTAADIAPVTVSAAG. 
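
The effect of a single missing nucleotide on a reading frame can be shown with a toy example. The sketch below does not use the fkbH sequence: the short DNA string is invented for illustration, and the helper translate is a hypothetical function built on the standard genetic code. With one T the frame reaches a stop codon early; with two Ts the frame shifts and translation continues to a later stop.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base until a stop codon or the end of the string."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented toy sequence (not fkbH) containing a single T at index 6.
one_t = "ATGGCATGACCGAATAA"
# Same sequence with a second T inserted next to the first.
two_t = one_t[:6] + "T" + one_t[6:]

print(translate(one_t))  # short product: the frame hits TGA early
print(translate(two_t))  # longer product: the shifted frame reads through
```

In the toy case the single-T sequence translates to a two-residue product, while the two-T sequence yields a five-residue product. This is the same kind of reading-frame extension the text proposes for fkbH.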

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase,
which can be supplied by the host cell but can also be supplied by recombinant
expression of the fkbS gene of the present invention. To increase yield of ethylmalonyl
CoA, one can also express the fkbE and fkbU genes. While such production can
be achieved using only the recombinant genes above, one can also achieve such
production by placing into the recombinant host cell a large segment of the DNA
provided by the cosmids of the invention. Thus, for 2-hydroxymalonyl and
2-methoxymalonyl CoA biosynthesis, one can simply provide the cells with the segment of
DNA located on the left side of the FK-520 PKS genes shown in Figure 1. For
ethylmalonyl CoA biosynthesis, one can simply provide the cells with the segment of
DNA located on the right side of the FK-520 PKS genes shown in Figure 1 or,
alternatively, both the right and left segments of DNA.

The recombinant DNA expression vectors that encode these genes can be used to
construct recombinant host cells that can make these important polyketide building
blocks from cells that otherwise are unable to produce them. For example, Streptomyces
coelicolor and Streptomyces lividans do not synthesize ethylmalonyl CoA or
2-hydroxymalonyl CoA. The invention provides methods and vectors for constructing
recombinant Streptomyces coelicolor and Streptomyces lividans that are able to
synthesize either or both ethylmalonyl CoA and 2-hydroxymalonyl CoA. These host
cells are thus able to make polyketides that require these substrates and that cannot
otherwise be made in such cells.

In a preferred embodiment, the present invention provides recombinant
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed
with a recombinant vector of the invention that codes for the expression of the
ethylmalonyl CoA biosynthetic genes. The resulting host cells produce ethylmalonyl
CoA and so are preferred host cells for the production of polyketides produced by PKS
enzymes that comprise one or more AT domains specific for ethylmalonyl CoA.
Illustrative PKS enzymes of this type include the FK-520 PKS and a recombinant PKS in
which one or more AT domains is specific for ethylmalonyl CoA.

In a related embodiment, the present invention provides Streptomyces host cells
in which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have
been deleted by homologous recombination or rendered inactive by mutation. For
example, deletion or inactivation of the fkbG gene can prevent formation of the methoxyl
groups at C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell,
FK-506), leading to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520
(or, in the corresponding FK-506 producing cell,
13,15-didesmethoxy-13,15-dihydroxy-FK-506). If the fkbG gene product acts on
2-hydroxymalonyl and the resulting 2-methoxymalonyl substrate is required for
incorporation by the PKS, the AT domains of modules 7 and 8 may bind malonyl CoA and
methylmalonyl CoA. Such incorporation results in the production of a mixture of
polyketides in which the methoxy groups at C-13 and C-15 of FK-520 (or FK-506) are
replaced by either hydrogen or methyl.

This possibility of non-specific binding is suggested by results from the
construction of a hybrid PKS of the invention in which the AT domain of module 8 of the
FK-520 PKS replaced the AT domain of module 6 of DEBS. The resulting PKS produced, in
Streptomyces lividans, 6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of
module 8 of the FK-520 PKS could bind malonyl CoA and methylmalonyl CoA substrates.
Thus, one could possibly also prepare the 13,15-didesmethoxy-FK-520 and corresponding
FK-506 compounds of the invention by deleting or otherwise inactivating one or more or
all of the genes required for 2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI,
fkbJ, and fkbK genes. In any event, the deletion or inactivation of one or more
biosynthetic genes required for ethylmalonyl and/or 2-hydroxymalonyl production prevents
the formation of polyketides requiring ethylmalonyl and/or 2-hydroxymalonyl for
biosynthesis, and the resulting host cells are thus preferred for production of
polyketides that do not require the same.

The host cells of the invention can be grown and fermented under conditions
known in the art for other purposes to produce the compounds of the invention. See, e.g.,
U.S. Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference,
for suitable fermentation processes. The compounds of the invention can be isolated
from the fermentation broths of these cultured cells and purified by standard procedures.
Preferred compounds of the invention include the following compounds:
13-desmethoxy-FK-506; 13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506;
13,15-didesmethoxy-FK-520; 13-desmethoxy-18-hydroxy-FK-506;
13-desmethoxy-18-hydroxy-FK-520; 13,15-didesmethoxy-18-hydroxy-FK-506; and
13,15-didesmethoxy-18-hydroxy-FK-520. These compounds can be further modified as
described for
tacrolimus and FK-520 in U.S. Patent Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323;
4,980,466; and 4,920,218, incorporated herein by reference.

Other compounds of the invention are shown in Figure 8, Parts A and B. In
Figure 8, Part A, illustrative C-32-substituted compounds of the invention are shown in
two columns under the heading R. The substituted compounds are preferred for topical
administration and are applied to the dermis for treatment of conditions such as psoriasis.
In Figure 8, Part B, illustrative reaction schemes for making the compounds shown in
Figure 8, Part A, are provided. In the upper scheme in Figure 8, Part B, the C-32
substitution is a tetrazole moiety, illustrative of the groups shown in the left column
under R in Figure 8, Part A. In the lower scheme in Figure 8, Part B, the C-32
substitution is a disubstituted amino group, where R3 and R4 can be any group similar to
the illustrative groups shown attached to the amine in the right column under R in Figure
8, Part A. While Figure 8 shows the C-32-substituted compounds in which the C-15
methoxy is present, the invention includes these C-32-substituted compounds in which
C-15 is ethyl, methyl, or hydrogen. Also, while C-21 is shown as substituted with ethyl
or allyl, the compounds of the invention include the C-32-substituted compounds in
which C-21 is substituted with hydrogen or methyl.

To make these C-32-substituted compounds, Figure 8, Part B, provides
illustrative reaction schemes. Thus, a selective reaction of the starting compound (see
Figure 8, Part B, for an illustrative starting compound) with trifluoromethanesulfonic
anhydride in the presence of a base yields the C-32 O-triflate derivative, as shown in the
upper scheme of Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or
triazole derivatives provides the C-32 tetrazole or triazole derivative. As shown in the
lower scheme of Figure 8, Part B, reacting the starting compound with
p-nitrophenylchloroformate yields the corresponding carbonate, which, upon displacement
with an amino compound, provides the corresponding carbamate derivative.

The compounds can be readily formulated to provide the pharmaceutical
compositions of the invention. The pharmaceutical compositions of the invention can be
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or
liquid form. This preparation contains one or more of the compounds of the invention as
an active ingredient in admixture with an organic or inorganic carrier or excipient
suitable for external, enteral, or parenteral application. The active ingredient may be
compounded, for example, with the usual non-toxic, pharmaceutically acceptable carriers
for tablets, pellets, capsules, suppositories, solutions, emulsions, suspensions, and any
other form suitable for use. Suitable formulation processes and compositions for the
compounds of the present invention are described with respect to tacrolimus in U.S.
Patent Nos. 5,939,427; 5,922,729; 5,385,907; 5,338,684; and 5,260,301, incorporated
herein by reference. Many of the compounds of the invention contain one or more chiral
centers, and all of the stereoisomers are included within the scope of the invention, as
pure compounds as well as mixtures of stereoisomers. Thus the compounds of the
invention may be supplied as a mixture of stereoisomers in any proportion.

The carriers which can be used include water, glucose, lactose, gum acacia,
gelatin, mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal
silica, potato starch, urea, and other carriers suitable for use in manufacturing
preparations, in solid, semi-solid, or liquefied form. In addition, auxiliary stabilizing,
thickening, and coloring agents and perfumes may be used. For example, the compounds
of the invention may be utilized with hydroxypropyl methylcellulose essentially as
described in U.S. Patent No. 4,916,138, incorporated herein by reference, or with a
surfactant essentially as described in EPO patent publication No. 428,169, incorporated
herein by reference.

Oral dosage forms may be prepared essentially as described by Hondo et al.,
1987, Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by
reference. Dosage forms for external application may be prepared essentially as
described in EPO patent publication No. 423,714, incorporated herein by reference. The
active compound is included in the pharmaceutical composition in an amount sufficient
to produce the desired effect upon the disease process or condition.

For the treatment of conditions and diseases relating to immunosuppression or
neuronal damage, a compound of the invention may be administered orally, topically,
parenterally, by inhalation spray, or rectally in dosage unit formulations containing
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The
term parenteral, as used herein, includes subcutaneous injections, and intravenous,
intramuscular, and intrasternal injection or infusion techniques.

Dosage levels of the compounds of the present invention are of the order from
about 0.01 mg to about 50 mg per kilogram of body weight per day, preferably from
about 0.1 mg to about 10 mg per kilogram of body weight per day. The dosage levels are
useful in the treatment of the above-indicated conditions (from about 0.7 mg to about 3.5
g per patient per day, assuming a 70 kg patient). In addition, the compounds of the
present invention may be administered on an intermittent basis, i.e., at semi-weekly,
weekly, semi-monthly, or monthly intervals.
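
The per-patient figures follow from multiplying the per-kilogram range by body weight. The following is a minimal sketch of that arithmetic, assuming the 70 kg patient named in the text; the function name `per_patient_range_mg` is hypothetical and chosen only for illustration.

```python
def per_patient_range_mg(low_mg_per_kg, high_mg_per_kg, body_weight_kg=70.0):
    """Scale a per-kilogram daily dose range to a per-patient daily range (mg)."""
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low end of the range exceeds the high end")
    return low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg


# Ranges quoted in the text, in mg per kg of body weight per day.
broad_range = per_patient_range_mg(0.01, 50.0)
preferred_range = per_patient_range_mg(0.1, 10.0)

for label, (low, high) in (("broad", broad_range), ("preferred", preferred_range)):
    print(f"{label}: {low:.1f} mg to {high:.1f} mg per patient per day")
```

For a 70 kg patient the broad range spans 0.7 mg to 3500 mg (3.5 g) per day, and the preferred range spans 7 mg to 700 mg per day.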

The amount of active ingredient that may be combined with the carrier materials
to produce a single dosage form will vary depending upon the host treated and the
particular mode of administration. For example, a formulation intended for oral
administration to humans may contain from 0.5 mg to 5 g of active agent compounded
with an appropriate and convenient amount of carrier material, which may vary from
about 5 percent to about 95 percent of the total composition. Dosage unit forms will
generally contain from about 0.5 mg to about 500 mg of active ingredient. For external
administration, the compounds of the invention can be formulated within the range of,
for example, 0.00001% to 60% by weight, preferably from 0.001% to 10% by weight,
and most preferably from about 0.005% to 0.8% by weight. The compounds and
compositions of the invention are useful in treating disease conditions using doses and
administration schedules as described for tacrolimus in U.S. Patent Nos. 5,542,436;
5,365,948; 5,348,966; and 5,196,437, incorporated herein by reference. The compounds
of the invention can be used as single therapeutic agents or in combination with other
therapeutic agents. Drugs that can be usefully combined with compounds of the
invention include one or more immunosuppressant agents such as rapamycin,
cyclosporin A, FK-506, or one or more neurotrophic agents.

It will be understood, however, that the specific dosage level for any particular
patient will depend on a variety of factors. These factors include the activity of the
specific compound employed; the age, body weight, general health, sex, and diet of the
subject; the time and route of administration and the rate of excretion of the drug;
whether a drug combination is employed in the treatment; and the severity of the
particular disease or condition for which therapy is sought.

A detailed description of the invention having been provided above, the
following examples are given for the purpose of illustrating the present invention and
shall not be construed as being a limitation on the scope of the invention or claims.






Example 1 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520 
The C-13 methoxyl group is introduced into FK-520 via an AT domain in
extender module 8 of the PKS that is specific for hydroxymalonyl and by methylation of
the hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase.
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position
into an inactive derivative that is further degraded by host P450 and other enzymes. The
present invention provides compounds related in structure to FK-506 and FK-520 that do
not contain the C-13 methoxy group and exhibit greater stability and a longer half-life in
vivo. These compounds are useful medicaments due to their immunosuppressive and
neurotrophic activities, and the invention provides the compounds in purified form and
as pharmaceutical compositions.

The present invention also provides the novel PKS enzymes that produce these 
novel compounds as well as the expression vectors and host cells that produce the novel 
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT
domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the
FK-506 and FK-520 PKS. This example describes the construction of recombinant DNA
compounds that encode the novel FK-520 PKS enzymes and the transformation of host 
cells with those recombinant DNA compounds to produce the novel PKS enzymes and 
the polyketides produced thereby.

To construct an expression cassette for performing module 8 AT domain
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster
was cloned into plasmid pLitmus 38 (a cloning vector available from New England
Biolabs). The 4.6 kb SphI fragment, which encodes the ACP domain of module 7
followed by module 8 through the KR domain, was isolated from an agarose gel after
digesting the cosmid pKOS65-C31 with SphI. The clone having the insert oriented so
the single SacI site was nearest to the SpeI end of the polylinker was identified and
designated as plasmid pKOS60-21-67. To generate appropriate cloning sites, two linkers
were ligated sequentially as follows. First, a linker was ligated between the SpeI and
SacI sites to introduce a BglII site at the 5' end of the cassette, to eliminate interfering
polylinker sites, and to reduce the total insert size to 4.5 kb (the limit of the phage
KC515). The ligation reactions contained 5 picomolar unphosphorylated linker DNA and
0.1 picomolar vector DNA, i.e., a 50-fold molar excess of linker to vector. The linker had
the following sequence:

5'-CTAGTGGGCAGATCTGGCAGCT-3'
    3'-ACCCGTCTAGACCG-5'

The resulting plasmid was designated pKOS60-27-1.

Next, a linker of the following sequence was ligated between the unique SphI and
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module
8 cassette. The linker employed was:

    5'-GGGATGCATGGC-3'
3'-GTACCCCTACGTACCGAATT-5'

The resulting plasmid was designated pKOS60-29-55.
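
The two linkers can be sanity-checked by pairing their strands and reading off the unpaired ends. The sketch below is only an illustration: the helper `single_stranded_ends` is hypothetical, and the recognition sequences AGATCT (BglII) and ATGCAT (NsiI) are standard values that the text itself does not spell out. Reading the second linker as carrying an NsiI site is an assumption based on the ATGCAT it contains.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def single_stranded_ends(top_5to3, bottom_3to5):
    """Pair two linker strands written left to right as in the text.

    Returns (left_end, right_end, owner), where the ends are the unpaired
    bases of the longer strand, or None if the shorter strand does not
    base-pair with any stretch of the longer one.
    """
    if len(top_5to3) >= len(bottom_3to5):
        longer, shorter, owner = top_5to3, bottom_3to5, "top"
    else:
        longer, shorter, owner = bottom_3to5, top_5to3, "bottom"
    start = longer.translate(COMPLEMENT).find(shorter)
    if start < 0:
        return None
    return longer[:start], longer[start + len(shorter):], owner


# Linker strands as printed (top strand 5'->3', bottom strand 3'->5').
LINKER_1 = ("CTAGTGGGCAGATCTGGCAGCT", "ACCCGTCTAGACCG")
LINKER_2 = ("GGGATGCATGGC", "GTACCCCTACGTACCGAATT")

# Recognition sequences (standard values, assumed here rather than quoted).
BGLII_SITE = "AGATCT"
NSII_SITE = "ATGCAT"

ends_1 = single_stranded_ends(*LINKER_1)
ends_2 = single_stranded_ends(*LINKER_2)
has_bglii = BGLII_SITE in LINKER_1[0]
has_nsii = NSII_SITE in LINKER_2[0]

print(ends_1, has_bglii)
print(ends_2, has_nsii)
```

In the first linker the top strand carries the unpaired ends CTAG and AGCT. In the second linker the bottom strand carries GTAC and AATT (read 3' to 5'), so the strands pair over 14 and 12 bases respectively with four-base single-stranded ends on each side.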

To allow in-frame insertions of alternative AT domains, sites were engineered at
the 5' end (AvrII or NheI) and 3' end (XhoI) of the AT domain using the polymerase
chain reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the
PCR and sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and
either Avr-rev or Nhe-rev:

SpeBgl-fwd 5'-CGACTCACTAGTGGGCAGATCTGG-3'
Avr-rev 5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3'
Nhe-rev 5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3'

The PCR included, in a 50 µl reaction, 5 µl of 10x Pfu polymerase buffer
(Stratagene), 5 µl 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM
dGTP, 1 mM 7-deaza-GTP), 5 µl DMSO, 2 µl of each primer (10 µM), 1 µl of template
DNA (0.1 µg/µl), and 1 µl of cloned Pfu polymerase (Stratagene). The PCR conditions
were 95°C for 2 min., 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 4
min., followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and
the Litmus vectors were cut with the appropriate restriction enzymes (BglII and AvrII or
SpeI and NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs),
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2,
respectively.
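
The final concentrations in the 50 µl reaction follow from simple dilution of the listed stocks. The sketch below works through that arithmetic; it assumes that the volume not accounted for by the listed components is made up with water, which the text does not state, and the names `VOLUMES_UL` and `final_concentration` are hypothetical.

```python
REACTION_UL = 50.0

# Volumes (in microliters) of each listed component; two primers at 2 µl each.
VOLUMES_UL = {
    "10x Pfu buffer": 5.0,
    "10x z-dNTP mixture": 5.0,
    "DMSO": 5.0,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "template DNA": 1.0,
    "Pfu polymerase": 1.0,
}


def final_concentration(stock, volume_ul, total_ul=REACTION_UL):
    """Dilution: stock concentration times the fraction of the reaction it fills."""
    return stock * volume_ul / total_ul


# Assumption: the remaining volume is water.
water_ul = REACTION_UL - sum(VOLUMES_UL.values())

primer_uM = final_concentration(10.0, VOLUMES_UL["forward primer"])
datp_mM = final_concentration(2.0, VOLUMES_UL["10x z-dNTP mixture"])
dgtp_mM = final_concentration(1.0, VOLUMES_UL["10x z-dNTP mixture"])
dmso_percent = final_concentration(100.0, VOLUMES_UL["DMSO"])
# Template stock of 0.1 µg/µl expressed as 100 ng/µl.
template_ng_per_ul = final_concentration(100.0, VOLUMES_UL["template DNA"])

print(f"water to add: {water_ul:.0f} µl")
print(f"each primer: {primer_uM:.1f} µM; dATP: {datp_mM:.1f} mM; dGTP: {dgtp_mM:.1f} mM")
print(f"DMSO: {dmso_percent:.0f}% (v/v); template: {template_ng_per_ul:.0f} ng/µl")
```

Under these assumptions each primer is at 0.4 µM, dATP, dCTP, and dTTP are at 0.2 mM, dGTP and 7-deaza-GTP are at 0.1 mM each, DMSO is at 10% by volume, and the reaction holds 100 ng of template.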

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify
sequence 3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev:

BsrXho-fwd 5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGGATC-3'
NsiAfl-rev 5'-CGACTCACTTAAGCCATGCATCC-3'

PCR conditions were as described above. The PCR fragment was cut with BsrGI
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and
inserted into pKOS60-37-2 cut with BsrGI and AflII, to give the plasmids pKOS60-39-1
and pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and
XhoI or NheI and XhoI, respectively, to insert heterologous AT domains specific for
malonyl, methylmalonyl, ethylmalonyl, or other extender units.

Malonyl and methylmalonyl-specific AT domains were cloned from the
rapamycin cluster using PCR amplification with a pair of primers that introduce an AvrII
or NheI site at the 5' end and an XhoI site at the 3' end. The PCR conditions were as
given above and the primer sequences were as follows:

RATN1 5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3'
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA),
RATMN2 5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3'
(Rap AT shorter version 5' sequence and specific for malonyl CoA),
RATMMN2 5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3'
(Rap AT shorter version 5' sequence and specific for methylmalonyl CoA), and
RATC 5'-ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG-3'
(Rap DH 5' sequence and universal for malonyl and methylmalonyl CoA).

[Diagram of a rapamycin PKS module (KS, AT, and DH domains) marking the primer sites:
N1 (AvrII), MN2 and MMN2 (NheI), and C (XhoI).]

Because of the high sequence similarity in each module of the rapamycin cluster,
each primer was expected to prime any of the AT domains. PCR products representing
ATs specific for malonyl or methylmalonyl extenders were identified by sequencing
individual cloned PCR products. Sequencing also confirmed that the chosen clones
contained no cloning artifacts. Examples of hybrid modules with the rapamycin AT12
and AT13 domains are shown in a separate figure.
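
Two of the rapamycin primers contain degenerate bases (R, Y, S), so each stands for a small pool of sequences. The sketch below expands each primer using the standard IUPAC ambiguity codes and reports which engineered restriction site it carries. The recognition sequences CCTAGG (AvrII), GCTAGC (NheI), and CTCGAG (XhoI) are standard values not quoted in the text, and the helper names `expand` and `sites_in` are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5'->3').
PRIMERS = {
    "RATN1": "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG",
    "RATMN2": "ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG",
    "RATMMN2": "ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA",
    "RATC": "ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG",
}

# Recognition sequences (standard values, assumed here rather than quoted).
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def sites_in(primer):
    """Names of the restriction sites found verbatim in the primer."""
    return [name for name, site in SITES.items() if site in primer]


for name, seq in PRIMERS.items():
    print(name, len(expand(seq)), sites_in(seq))
```

With these inputs RATN1 and RATC each expand to four sequences, while RATMN2 and RATMMN2 are unambiguous. RATN1 carries the AvrII site, RATMN2 and RATMMN2 carry the NheI site, and RATC carries the XhoI site, in each case starting at the third base of the primer.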

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 of the
rapamycin PKS has the DNA sequence and encodes the amino acid sequence shown
below. The AT of rap module 12 is specific for incorporation of malonyl units.

20 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
IWQLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGG ACCTCGGC ATCG ACTCGCTC ACCGCGGTCC AGCTGCGCAACG 150 
25 FKDLG I DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCeCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGK LG DELTG 
30 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPR TAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
OEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
35 asPEELWHLVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHG GFL 
40 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
45 EAFESAG1TPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
TDGFGATGSQTSVLSG 
50 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFVGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900

SGECS LALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
5 GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSOAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
10 GHTVLAVVRGSAVNQDG 

GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLTPADVDA 
1 5 TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
20 SLKSN IGHAQAASGVA 

GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMVQALRHGE LPPT 
CTGCACGCCGACGAGCCGTCGCCGC ACGTCGACTGGACGGCCGGCGCCGT 14 50 
LHADEPS phvdwtagav 
25 CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
ELLTSARPWPETDRPR 
GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 
30 LESAPPTQPADNAVIER 

GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 

APEWVPLVISARTQSA 
TGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 1700 
LTEHEGRLRAYLAASPG 
3 5 GTGGAT ATGCGGGCTGTGGCATCGACGCTGGCGATGACACGGTCGGTGTT 17 50 
VDMRAVASTLAMTRSVF 
CGAGCACCGTGCCGTGCTGCTGGGAGATGACACCGTCACCGGCACCGCTG 1800 

EHRAVLLGDDTVTGTA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGACAGGGGTCGCAGCGT 1850 
40 VSDPRAVFVFPGQGSQR 

GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1900 

AGMGEELAAAFPVFARI 
CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG 1950 
HQQVWDLLDVPDLEVN 
45 AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000 
ETGYAQPAL FAMQVALF 
GGGCTGCTGGAATCGTGGGGTGT ACGACCGGACGCGGTGATCGGCCATTC 2050 

GLLESWGVRPDAVIGHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 
50 VGELAAAYVSGVWSLE 

ATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
DACTLVSARARLMQALP 
GCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGC 2200 
AGGVMVAVPVSEDEARA 
5 5 CGTGCTGGGTG AGGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 2250 
VLGEGVEIAAVNGPSS 
TGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTG 2300 
VVLSGDEAAVLQAAEGL 
GGG AAGTGGACGCGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTAT 2350 

60 G KWT RLAT S HAFH SARM 

GGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACC 24 00 

EPMLEEFRAVAEGLTY 
GGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 2 4 50 
RTPQVSMAVGDQVTTAE
TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2500

YWVRQVRDTVRFGEQVA 
CTCG T ACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550 
SYEDAVFVELGADRSL 
5 CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2600 
ARLVDGVAMLHGDHEIQ 
GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2650 

AAIGALAHLYVNGVTVD 
CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 2700 
10 WPALLGDAPATRVLDL 

CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 27 50 
PTYAFQHQRYWLESARP 
GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 2800 
AASDAGHPVLGSG IALA 
1 5 CGGGTCGCCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACC 28 50 
GSPGRVFTGSVPTGAD 
GCGCGGTGTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGAC 2900 
R A V FVAE.LALAAADAV D 
TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2950 
20 CATVERLD IASVPGRPG 

CCATGGCCGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACG 3000 

HGRTTVQTWVDE PAD P 
GCCGGCGCCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACG 3050 
GRRRFTVH TRTGDAPWT 
25 CTGCACGCCfGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGC 3100 
LHAEGVLRPHGTALPDA 
GGCCG ACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 

ADAEWPPPGAVPADGL 
CGGGTGTGTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGAC 3200 
30 PGVWRRGDQVFAEAEVD 

GGACCGGACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTC 3250 

GPDG FVVH PDLLDAVFS 
CGCGGTCGGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGG 3300 
AVGDGSRQPAGWRDLT 
35 TGCACGCGTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACC 3350 
V HAS DATVLRACLT RRT 
G ACGG AGCCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACT 34 00 

DGAMGFAAFDGAGLPVL 
CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 34 50 
40 TAEAVTLREVAS PSGS 

AGGAGTCGGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCG 3500 
EESDGLHRLEWLAVAEA 
GTCTACGACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCA 3550 
VYDGDLPEGHVL .ITAAH 
45 CCCCGACGACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCC 3600 
PDD PEDIPTRAHTRAT 
GCGTCCTGACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTC 3650 
RVLTALQHHLTTTDHTL 
ATCGTCCACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCAC 3700 
50 I VHTTTDPAGATVTGLT 

CCGCACCGCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCG 3750 

RTAQNEHPHRIRLIET 
ACCACCCCCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCAC 3800 
DHPHTPLPLAQLATLDH 
55 CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 38 50 
PHLRL THHTLHHPHLTP 
CCTCCACACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACG 3900 

LHTTTPPTTTPLNPEH 
CCATCATCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGC 3950 
60 AI I ITGGSGTLAGILA^R 

CACCTGAACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4000 

HLNHPHTYLLSRTP PPD 
CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4 050 
ATPGTHLPCDVGDPHQ
TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100
LATTLTHI PQPLTAIFH 
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150 
TAATLDDGILHALTPDR 
5 CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4 200 
LTTVLH PKANAAWHLH 
ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4 250 
HLTQNQPLTHFVLYSSA 
GCCGCCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4 300 
10 AAVLGS PGQGNYAAANA 

CTTCCTCGACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4 350 

FLDALATHRHTLGQPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4400
TSIAWGMWHTTSTLTGQ
CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 4450
LDDADRDRIRRGGFLPI
CACGGACGACGAGGGCATGGGGATGCAT
TDDEG
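
The reading frame of the listing above can be checked by translating its first line. The sketch below builds the standard genetic code table and translates the first 50 nucleotides starting at the third base, which is the offset that reproduces the amino acid line printed beneath them. The function name `translate` and its `frame` parameter are hypothetical, and use of the standard code is an assumption.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate complete codons of dna, starting at 0-based offset frame."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# First and last sequence lines of the listing, as printed.
FIRST_LINE = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
LAST_LINE = "CACGGACGACGAGGGCATGGGGATGCAT"

first_protein = translate(FIRST_LINE, frame=2)
last_protein = translate(LAST_LINE, frame=1)
print(first_protein)
print(last_protein)
```

The first line translates to IWQLAEALLTLVREST in this frame, and the last line begins TDDEG when read from its second base, in agreement with the residues printed in the listing.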

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
25 QLAEALLTLVREST : . 

GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGE DI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
FKDLGIDSLTAVQLRN 
30 CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPT PHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
35 TRAPVVPRTAATAGAH 

ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
ASPEEL WHLVASGTDAI 
40 CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 
TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCG ACGCGGCGTTCTTCGGCATCAGCCCGCGCG A 550 

TGATG FDAAFFG I S P RE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAM D pQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 

TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 8 50 

ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGYTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950
SPGGFVEFSRQRGLAPD
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDP I EAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNI GHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADEPS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 

ELLTSARPWPETDRPR 
GGGCGGGCGTGTCGTCCTTCGGAGTC AGCGGCACCAACGCCCACGTCATC 1550 
RAGVSS FGVSGTNAHVI 
CTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGAGGCGCAGCCTGTTGA 1600 

LESAPPAQPAEEAQPVE 
GACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGGTGATATCGGCCAAGA 1650 

TPVVASDVLPLVISAK 
CCCAGCCCGCCCTGACCG AACACGAAGACCGGCTGCGCGCCTACCTGGCG 1700 
TQPALTEHEDRLRAYLA 
GCGTCGCCCGGGGCGGATATACGGGCTGTGGCATCGACGCTGGCGGTGAC 17 50 

AS PGADI RAVASTLAVT 
ACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTGGAGATGACACCGTCA 1800 

RSVFEH RAVLLGDDTV 
CCGGGACCGCGGTGACCG ACCCCAGGATCGTGTTTGTCTTTCCCGGGCAG 1850 
TGTAVTDPRIVFVFPGQ 
GGGTGGCAGTG GCTGGGGATGGGCAGTGCACTGCGCGATTCGTCGGTGGT 1900 

GWQWLGMGSALRDSSVV 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 

FAERMAECAAALREFV 
ACTGGGATCTGTTCACGGTTCTGGATGATCCGGCGGTGGTGGACCGGGTT 2000 
DWDLFTVLDDPAVVDRV 
GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGTTTCCCTGGCCGCGGT 2050 

DVVOPASWAMMVSLAAV 
GTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGATCGGCCATTCGCAGG 2100 

WQAAGVRPDAVI GHSQ 
GTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTGTCACTACGCGATGCC 2150 
GEIAAACVAGAVSLRDA 
GCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGCCCGGGGCCTGGCGGG 2200 

ARI VTLRSQAIARGLAG 
CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGAGCTGG 2250 

RGAMAS VALPAQDVEL 
TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2300 
VDGAWIAAHNGPASTVI 
GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2350 

AGTPEAVDHVLTAHEAQ 
AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 24 00 

GVRVRRITVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 24 50 
HVELIRDELLDITSDSS- 
TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 

SQTPLVPWLSTVDGTWV 
CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550
DSPLDGEYWYRNLREP
TCGGTTTCCACCCCGCCGTCAGCC AGTTGCAGGCCC AGGGCG ACACCGTG 2600 
VGFHPAVSQLQAQGDTV 
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2650 
5 FVEVSAS PVLLQAMDDD 

TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCACCCGGA 27 00 

VVTVATLRRDDG DATR 
TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 27 50 
MLTALAQAYVHGVTVDW 
1 0 CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800 
PAI LGTTTTRVLDLPTY 
CGCCTTCC AACACCAGCGGT ACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850 

AFQHQRYWLESARPAA 
CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2 900 
15 SDAGHPVLGSGIALAGS 

CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2 950 

PGRVFTGSVPTGADRAV 
GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 
FVAE LALAAADAVDCA 
20 CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
CGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCG 3100 

RTTVQTWVDEPADDGRR 
CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACG 3150 
25 RFTVHTRTGDAPW TL H 

CCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200 
AEGVLRPHGTALPDAAD 
GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGT 3250 
AEWPPPGAVPADGLPGV 
30 GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3300 
WRRGDQVFAEAEVDGP 
ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3350 
DGFVVH PDLLDAVFSAV 
GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 34 00 
35 GDGSRQPAGWRDLTVHA 

GTCGG ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAG 34 50 

SDATVLRACLTRRT DG 
CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500 
AMG FAAFDGAGLPVLTA 
40 GAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3550 
EAVTLREVASPSGSEES 
GGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACG 3600 

DGLHRLEWLAVAEAVY 
ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
45 DGDLPEGHVLITAAHPD 

GACCCCGAGGACAT ACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCT 3700 

DPEDI PTRAHTRATRVL 
GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 3750 
TALQHHLTTTDHTLIV 
50 ACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACC 3800 
HTTT DPAGATVTGLTRT 
GCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCC 3850 

AQN. EHPHRIRLI ETDHP 
CCACACCCCCCTCCCCCTGGCCCAACTCGCCAGCCTCGACCACCCCCACC 3900 
55 HTPLPLAQLATLDHPH 

TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3950 
LRLTHHT LHHPHLTPLH 
ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAAC ACGCCATCAT 4 000 
TTTPPTTTPLNPEHAI I 
60 CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4 050 
I TGGSGTLAGILARHL 
ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4100 
NHPHTYLL SRTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150
PGTHLPCDVGDPHQLAT
CACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4200 

TLTH I PQPLTAI F H T A 
CCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACC 4250 
5 ATLDDG I LHALTPDRLT 

ACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4 300 

TVLHPKANAAWHLHHLT 
CCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCG 4 350 
1A QNQPLTH FVLYSSAAA 

1 0 TCCTCGGC AGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4 4 0 0 
V L G S PGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4 4 50 

DALATHRHTLGQPATSI 
CGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4 500 ' 
15 AWGMWHTTSTLTGQLD 

ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4 550 
DADRDRIRRGGFLPITD 
GACGAGGGCATGGGGATGCAT 
D E G

The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 (specific for
malonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid
sequence shown below.

25 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 1 50 
30 FKDLG I DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
35 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAI VGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
40 AS PEELWHLVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TE FPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
45 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATG FDAAFFGI S PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCG AAAGCGCCGGCATCACCCCGGACTCGACCCGCGGC AGCGAC 650 
50 EAFESAG ITPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
55 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 

ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
60 SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKA FGAGADGTS FAE 
5 GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
10 ASNGLSAPNGPSQERVI 

CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
1 5 GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERATPLLLG 
CTCGCTG AAGTCCAACATCGGCC ACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

S LKSN IGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GIIKMVQALRHGELPPT

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

L HADE PS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
25 GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVSS FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 

LEAGPVTETPAASPSGD
CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
30 LPLLVSARSPEALDEQ 

TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
I RRLRAYLDTTPDVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQT LARRTHFAHRAV 
35 GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 
LLGDTVITTPPADRPD
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGT 1 900 
EQLAAAFPVFARIHQQV

GTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACG 1950 

WDLLDVPDLEVNETGY 
CCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAA 2000 
AQ PAL FAMQVA LFGLLE 
45 TCGTGGGGTGT ACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCT 2050 
SWGVRPDAVIGHSVGEL
TGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTT 2100 

AAAYVSGVWSLEDACT 
TGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTG 2150 
50 LVSARARLMQAL PAGGV 

ATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGA 2200 

MVAVPVS E DEARAVLGE 
GGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCG 2250 
GVEIAAVNGPSSVVLS 
55 GTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACG 2300 
GDEAAVLQAAEGLGKWT 
CGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCT 2350 

RLATSHAFHSARMEPML 
GGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGG 24 00 
60 EEFRAVAEGLTYRTPQ 

TCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGG 24 50 
VSMAVGDQVTTAEYWVR 
CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500 
QVRDTVRFGEQVASYED 
CGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCG 2550

AVFVELGADRSLARLV 
ACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGC 2 600 
DGVAMLHGDHEIQAAIG 
5 GCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCT 2650 
ALAH LYVNGVTVDWPAL 
CCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCT 2 7 00 

LGDAPATRVLDLPTYA 
TCCAGCACC AGCGCTACTGGCTCGAGTCGGCACGCCCGGCCGCATCCGAC 2750 
10 FQHQRYWLESARPAASD 

GCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCGCCGGG 2800 

A G H PVLGSGIALAGSPG 
CCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGTGTTCG 2850 
RVFTGSVPTGADRAVF 
15 TCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCACGGTC 2900 
VAELALAAADAVDCATV 
GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2950 

ERLDIASVPGRPGH GRT 
GACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCGCCGGT 3000 
20 TVQTWVDEPADDGRRR 

TCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACGCCGAG 3050 
FTVHTRTGDAPWTLHAE 
GGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGACGCCGA 3100 
GVLRPHGTALPDAADA E 
25 GTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGTGTGGC 3150 
W P P PGAVPADGLPGVW 
GCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
RRGDQVFAEAEVDGPDG 
TTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGA 3250 
30 FVVHPDLLDAVFSAVGD 

CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGCGTCGG 3300 

GSRQPAGWRDLTVHAS 
ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAGCCATG 3350 
DATVLRACLTRRTDGAM 
35 GGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCGGAGGC 3400 
GFAAFDGAGLPVLTAEA 
GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTCGGACG 34 50 

VTLREVASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACGACGGT 3500 
40 GLHRLEWLAVAEAVYDG 

GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 

DLPEGHVLITAAHPDDP 
CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3600
EDI PTRAHTRATRVLT 
45 CCCTGCAAC ACCACCTCACCACCACCG ACCACACCCTCATCGTCCACACC 3650 
ALQHHLTTTDHTLIVHT 
ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3700 

TT D PAG ATVTGLTRTAQ 
GAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 3750 
50 NEH PHRI RLIETDHPH 

CCCCCCTCCCCCTGGCCCAACTCGCC ACCCTCGACCACCCCCACCTCCGC 3800 
TPLPLAQLATLDHPHLR
CTCACCCACCAC ACCCTCCACCACCCCCACCTCACCCCCCTCCACACCAC 38 50 
LTHHTLHHPH LTPLHTT 
55 CACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCATCATCA 3900 
TPPTTTPLNPEHAI II 
CCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950 
TGGSGTLAGI LARHLNH 
CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4 000 
PHTYLLSRTPPPDATPG

CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4050 

THLPCDVGDPHQLATT 
TCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCGCCACC 4100 
LTH I PQPLTAIFHTAAT 

CTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACCACCGT 4150 

LDDGILHALTPDRLTTV 
CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4 200 
LHPKANAAWHLHHLTQ 
5 ACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCGTCCTC 4 250 
NQPLTH FVLYSSAAAVL 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTCGACGC 4 300 

GSPGQGNYAAANAFLDA 
CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4 350 
10 LAT HRHT LGQPATSIA 

GGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACGACGCC 4 4 00 
WGMWHTTSTLTGQLDDA 
GACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGACGACGA 4 4 50 
DRDRIRRGGFLPITDDE 
GGGCATGGGGATGCAT
G 
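
The paired nucleotide and amino acid lines in the listing above can be spot-checked by translating the nucleotide line in the appropriate reading frame. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the `offset` parameter are illustrative and do not appear in the original text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate a DNA string starting at a 0-based offset (hypothetical helper).

    Whitespace is dropped; codons containing non-ACGT characters give 'X'.
    """
    seq = "".join(dna.split()).upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 50 nucleotides of the module 8 cassette as listed above.
first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
print(translate(first_line, offset=8))
```

An 'X' in the output, or a residue that disagrees with the printed amino acid line, flags a position where the nucleotide line was probably corrupted during scanning.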

The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 

QLAEALLTLVREST
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
25 AAVLGHVGGEDI PATAA 

GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150

FKDLG I DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEAT GVRLNATAVFD 
30 TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
35 DE PLAI VGMACRLPGGV 

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

AS PEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 
TEFPTDRGWDVDAIYD
40 CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKT FVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGI SPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
45 ALAMDPQQRVLLETSW 

AGGCGTTCGAAAGCGCCGGC ATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 7 00 
TGVFVGAFS YGYGTGAD 
50 CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 8 50 
55 ACSSSLVALHQAGQSLR 

CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SG ECS LALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
S PG G FVE FS RQRG LAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000
GRAKAFGAGADGTSFAE
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050
GAGVLIVERLSDAERN

GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADVDA 
TCG AGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
15 SLKSNIGHAQAASGVA 

GC ATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
G I I KMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 
LHADEPS PHVDWTAGAV 
20 CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
E L LT SARPWPET DRPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVS S FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 
25 LEAGPVTETPAASPSGD 

CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 

LPLLVSARSPEALDEQ 
TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV 
30 GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRTHFAHRAV
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 

LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850
35 ELVFVYSGQGTQH P AMG 

GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900 

EQLADSSVVFA ERMAEC 
TGCGGCGGCGTTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950 
AAALREFVDWDLFTVL 
40 ATGATCCGGCGGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGG 2000 
DDPAVVDRVDVVQPASW 
GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050 

A M M V S L A AVWQ AA G V R P 
GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 
45 DAVIGHSQGEIAAACV 

CGGGTGCGGTGTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGC 2150 
AGAVSLRDAARI VTLRS 
CAGGCG ATCGCCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGC 2200 
QAIARGLAGRGAMASVA

50 CCTGCCCGCGCAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCC 2250 
LPAQDVELVDGAW IAA 
ACAACGGGCCCGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300 
HNG PASTVIAGT PEAVD 
CATGTCCTCACCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCAC 2350 
55 HVLTAHEAQGVRVRRIT 

CGTCG ACTATGCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAAC 24 00 

VDYASHTPHVELIRDE 
TACTCGACATCACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGG 24 50 
LLDITSDSSSQTPLVPW 
60 CTGTCGACCGTGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTA 2500 
LSTVDGTWVDS PLDGEY 
CTGGTACCGGAACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCC 2550 

WYRNLREPVGFHPAVS 
AGTTGCAGGCCCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCG 2600 
QLQAQGDTVFVEVSASP
GTGTTGTTGCAGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCG 2 650 

VLLQAMDDDVVTVATLR 
TCGTGACGACGGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCT 2700 
5 RDDGDATRMLTALAQA 

ATGTCCACGGCGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACA 2750 
YVHGVTVDWPAILGTTT 
ACCCGGGTACTGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTG 2800 
TRVLDL PTYAFQHQRYW 
1 0 GCTCGAGTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGG 2 850 
LESARPAAS DAGHPVL 
GCTCCGGTATCGCCCTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTTCC 2900 
GSG IALAGSPGRVFTGS 
GTGCCGACCGGTGCGGACCGCGCGGTGTTCGTCGCCGAGCTGGCGCTGGC 2 950 
15 VPTGADRAVFVAELALA 

CGCCGCGGACGCGGTCGACTGCGCCACGGTCGAGCGGCTCGACATCGCCT 3000 

AADAVDCATVERLDIA 
CCGTGCCCGGCCGGCCGGGCCATGGCCGGACGACCGT ACAGACCTGGGTC 3050 
SVPGRPGHGRTTVQTWV 
20 GACGAGCCGGCGGACGACGGCCGGCGCCGGTTCACCGTGCACACCCGCAC 3100 
DEPADDGRRRFTVHTRT
CGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTGCTGCGCCCCCATG 3150 

GDAPWTLHAEGVLRPH
GCACGGCCCTGCCCGATGCGGCCGACGCCGAGTGGCCCCCACCGGGCGCG 3200 
25 GTALPDAADAEWPPPGA 

GTGCCCGCGGACGGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 3250 

VPADGLPGVWRRGDQVF 
CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3300 
AEAEVDGPDGFVVHPD 
30 TGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGAAGCCGCC AGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 
GGATGGCGCGACCTG ACGGTGCACGCGTCGGACGCCACCGTACTGCGCGC 34 00 

GWRDLTVHASDATVLRA 
CTGCCTCACCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 3450 
35 CLTRRTDGAMGFAAFD 

GCGCCGGCCTGCCGGT ACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
GAGLPVLTA EAVTLREV 
GCGTCACCGTCCGGCTCCGAGGAGTCGGACGGCCTGCACCGGTTGGAGTG 3550 
AS P SGS E ES DG L H RLEW 
40 GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTGCCCGAGGGACATG 3600 
LAVAEAVYDGDLPEGH 
TCCTGATCACCGCCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3650 
VLI TAAH PDDP EDI PTR 
GCCCACACCCGCGCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 3700 
45 AH T RATRVLTALQHHLT 

CACCACCGACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 3750 

TTDHTLIVHTTTDPAG 
CCACCGTCACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEH PHR 
50 ATCCGCCTCATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 3850 
IRLIETD HPHTPLPLAQ 
ACTCGCCACCCTCGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900

LATLDHPHLRLTHHTL 
ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3950 
55 HHPHLTPLHTTTPPTTT 

CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4000 

PLNPEHAI I ITGGSGTL 
CGCCGGCATCCTCGCCCGCCACCTGAACCACCCCCACACCT ACCTCCTCT 4 050 
AGILARHLNHPHTYLL
60 CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4100 
SRT PPPDATPGTHL P C D 
GTCGGCGACCCCCACCAACTCGCCACCACCCTCACCCACATCCCCCAACC 4150 

VGDPHQLATTLTHIPQP
CCTC ACCGCCATCTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4200 
LTAIFHTAATLDDGIL
ACGCCCTCACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4 2 50 
HALT PDRLTTVLH PKAN 
GCCGCCTGGCACCTGCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4 300 
5 AAWHLHHLTQNQPLTHF 

CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4 350 

VL Y SSAAAVLGS PGQG 
ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 4 4 00 
NYAAANAFLDALATHRH 
10 ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 4 4 50 
TLGQPATSIAWGMWHTT
CAGCACCCTCACCGGACAACTCGACGACGCCGACCGGGACCGCATCCGCC 4 500 

ST LTGQLDDADRDRIR 
GCGGCGGTTTCCTCCCGATCACGGACGACGAGGGCATGGGGATGCAT 
15 RGGFLPITDDEG 

Phage KC515 DNA was prepared using the procedure described in Genetic
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A
phage suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on
S. lividans TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to
circularize at the cos site, subsequently digested with restriction enzymes BamHI and
PstI, and dephosphorylated with SAP.

Each module 8 cassette described above was excised with restriction enzymes
BglII and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage
DNA prepared as described above. The ligation mixture containing KC515 and various
cassettes was transfected into protoplasts of Streptomyces lividans TK24 using the
procedure described in Genetic Manipulation of Streptomyces, A Laboratory Manual,
edited by D. Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques
were restreaked on plates overlaid with TK24 spores. Single plaques were picked and
resuspended in 200 µL of nutrient broth. Phage DNA was prepared by the boiling
method (Hopwood et al., supra). PCR with primers spanning the left and right
boundaries of the recombinant phage was used to verify that the correct phage had been
isolated. In most cases, at least 80% of the plaques contained the expected insert. To
confirm the presence of the resistance marker (thiostrepton), a spot test is used, as
described in Lomovskaya et al. (1997), in which a plate with spots of phage is overlaid
with a mixture of spores of TK24 and a phiC31 TK24 lysogen. After overnight incubation,
the plate is overlaid with antibiotic in soft agar. A working stock is made of all phage
containing desired constructs.
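
The cassette ends produced by BglII and NsiI ligate into BamHI- and PstI-cut phage DNA because each pair of enzymes leaves the same single-stranded overhang. The following minimal sketch illustrates that compatibility and locates recognition sites in a sequence. The recognition sequences and cut positions are the standard published values for these enzymes; the dictionary and function names are hypothetical and are not part of the original procedure.

```python
# Recognition sequence and top-strand cut offset for each enzyme.
# All six sites are palindromic, so the bottom-strand cut mirrors the top.
ENZYMES = {
    "BglII": ("AGATCT", 1),
    "BamHI": ("GGATCC", 1),
    "NsiI": ("ATGCAT", 5),
    "PstI": ("CTGCAG", 5),
    "NheI": ("GCTAGC", 1),
    "XhoI": ("CTCGAG", 1),
}


def overhang(enzyme):
    """Return (overhang type, overhang sequence) for a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    low, high = sorted((cut, len(site) - cut))
    kind = "5'" if cut < len(site) - cut else "3'"
    return kind, site[low:high]


def compatible(enzyme_a, enzyme_b):
    """Two ends can be ligated if overhang type and sequence both match."""
    return overhang(enzyme_a) == overhang(enzyme_b)


def find_sites(dna, enzyme):
    """Return 0-based start positions of an enzyme's recognition site."""
    site = ENZYMES[enzyme][0]
    seq = "".join(dna.split()).upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


# The first and last nucleotides of a module 8 cassette as listed above,
# joined here only to illustrate the terminal BglII and NsiI sites.
cassette_ends = "AGATCTGGCAGCTCGCC" + "GGGCATGGGGATGCAT"
print(compatible("BglII", "BamHI"), compatible("NsiI", "PstI"))
print(find_sites(cassette_ends, "BglII"), find_sites(cassette_ends, "NsiI"))
```

Note that a BglII end joined to a BamHI end, or an NsiI end joined to a PstI end, yields a hybrid junction that neither enzyme of the pair recognizes.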

Streptomyces hygroscopicus ATCC 14891 (see US Patent No. 3,244,592, issued
5 Apr 1966, incorporated herein by reference) mycelia were infected with the
recombinant phage by mixing the spores and phage (1 x 10^8 of each), and incubating on
R2YE agar (Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D.
Hopwood et al.) at 30°C for 10 days. Recombinant clones were selected and plated on
minimal medium containing thiostrepton (50 µg/mL) to select for the thiostrepton
resistance-conferring gene. Primary thiostrepton-resistant clones were isolated and
purified through a second round of single colony isolation, as necessary. To obtain
thiostrepton-sensitive revertants that underwent a second recombination event to evict
the phage genome, primary recombinants were propagated in liquid media for two to
three days in the absence of thiostrepton and then spread on agar medium without
thiostrepton to obtain spores. Spores were plated to obtain about 50 colonies per plate,
and thiostrepton-sensitive colonies were identified by replica plating onto
thiostrepton-containing agar medium. PCR was used to determine which of the
thiostrepton-sensitive colonies reverted to the wild type (reversal of the initial
integration event), and which contained the desired AT swap at module 8 in the ATCC
14891-derived cells. The PCR primers used amplified either the KS/AT junction or the
AT/DH junction of the wild-type and the desired recombinant strains. Fermentation of
the recombinant strains, followed by isolation of the metabolites and analysis by LC-MS
and NMR, is used to characterize the novel polyketide compounds.

Example 2 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506 
The present invention also provides the 13-desmethoxy derivatives of FK-506
and the novel PKS enzymes that produce them. A variety of Streptomyces strains that
produce FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927),
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S.
hygroscopicus subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366,
incorporated herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S.
Patent No. 5,116,756, incorporated herein by reference; and S. sp. MA6548, described
in Motamedi et al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al.,
1997, "Structural organization of a multifunctional polyketide synthase involved in the
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80,
each of which is incorporated herein by reference.

The complete sequence of the FK-506 gene cluster from Streptomyces sp.
MA6548 is known, and the sequences of the corresponding gene clusters from other
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant
gene clusters of the present invention differ from the naturally occurring gene clusters in
that the AT domain of module 8 of the naturally occurring PKSs is replaced by an AT
domain specific for malonyl CoA or methylmalonyl CoA. These AT domain
replacements are made at the DNA level, following the methodology described in
Example 1.
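
A domain replacement made at the DNA level can be pictured in silico as exchanging the segment bounded by two restriction sites in the recipient module for the corresponding segment of a donor. The sketch below is illustrative only: it assumes that each flanking site occurs exactly once in both sequences, and the function name and toy sequences are hypothetical rather than taken from the constructs described herein.

```python
def swap_between_sites(recipient, donor, left_site, right_site):
    """Replace the recipient segment bounded by two sites with the donor's.

    Assumes (hypothetically) that each site occurs exactly once in each
    sequence and that the left site precedes the right site.
    """
    def bounds(seq):
        if seq.count(left_site) != 1 or seq.count(right_site) != 1:
            raise ValueError("each flanking site must occur exactly once")
        start = seq.index(left_site)
        end = seq.index(right_site) + len(right_site)
        if start >= end - len(right_site):
            raise ValueError("left site must precede right site")
        return start, end

    r_start, r_end = bounds(recipient)
    d_start, d_end = bounds(donor)
    # Keep the recipient flanks; take the site-to-site segment from the donor.
    return recipient[:r_start] + donor[d_start:d_end] + recipient[r_end:]


NHEI, XHOI = "GCTAGC", "CTCGAG"
# Toy sequences standing in for a native module and a donor AT cassette.
native = "AAAA" + NHEI + "TTTTTT" + XHOI + "CCCC"
donor = "GG" + NHEI + "ACGTACGT" + XHOI + "GG"
print(swap_between_sites(native, donor, NHEI, XHOI))
```

In practice the reading frame across both junctions must also be preserved, which this sketch does not check.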

The naturally occurring module 8 sequence for the MA6548 strain is shown
below, followed by the illustrative hybrid module 8 sequences for the MA6548 strain.

GGATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
MRLYEAARRTGSPVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
10 AAALDDAPDVPLLRGLR 

GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
1 5 TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
20 VQLRNALTTATGV RLNA 

ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

T AVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 
DELAGTRAPVAARTAA 
25 CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 

TAAAH DEPLAIVGMAC R 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
30 GTDAI TEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DA LYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG
35 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

I S PREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850
ARGSDTGVFIGAFSYGY

CGGCACGCGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
45 GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQ S LR S G ECS LALVGG 
TCACGGTG ATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
50 VTVMAS PGG FVE FSRQR 

GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TS FAEGAGALVVERL S 
55 ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 

DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
5 P I EAQALLATYGQDRAT 

GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 

PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
10 GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 

ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 17 00 
15 TGRPRRAAVSS FGVSGT 

AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 17 50 

NAH I I LEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGT AGGACCGGTCGAGGCTG 1800 
A G A I EAGPVEVGPVEA 
20 GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 

GPLPAAPPSAPGEDLP L 
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARS PEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
25 RAYLDTG PGVDRAAVA 

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QTLARRTHFTHRAVLLG
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
DTVIGAP PADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAACTCG 2100
VYSGQGTQHPAMGEQL 
CGGCCGCGTTCCCCGTGTTCGCCGATGCCTGGCACGACGCGCTCCGACGG 2150 
AAAFPVFADAWH DALRR 
CTCGACGACCCCGACCCGCACGACCCCACACGGAGCCAGCACACGCTCTT 2200 
35 LDDPDPHDPTRSQHTLF 

CGCCC ACCAGGCGGCGTTCACCGCCCTCCTGAGGTCCTGGGACATCACGC 2250 

AHQAAFTALLRSWDIT
CGCACGCCGTCATCGGCCACTCGCTCGGCGAGATCACCGCCGCGTACGCC 2300 
PHAVIGHSLGEITAAYA
GCCGGGATCCTGTCGCTCGACGACGCCTGCACCCTGATCACCACGCGTGC 2350

AG I LSLDDACTLITTRA 
CCGCCTCATGCACACGCTTCCGCCGCCCGGCGCCATGGTCACCGTGCTGA 24 00 

RLMHTLPP PGAMVTVL 
CCAGCGAGGAGGAGGCCCGTCAGGCGCTGCGGCCGGGCGTGGAGATCGCC 24 50 
45 TSEEEARQALRPGVEIA 

GCGGTCTTCGGCCCGCACTCCGTCGTGCTCTCGGGCGACGAGGACGCCGT 2500 

AVFGPHSVVLSGDEDAV 
GCTCGACGTCGCACAGCGGCTCGGCATCCACCACCGTCTGCCCGCGCCGC 2550 
LDVAQRLGIHHRLPAP 
50 ACGCGGGCCACTCCGCGCACATGGAACCCGTGGCCGCCGAGCTGCTCGCC 2 600 

HAG HSAHMEPVAAELLA 
ACCACTCGCGAGCTCCGTTACGACCGGCCCCACACCGCCATCCCGAACGA 2 650 

TTRELRYDRPHTAIPND
CCCCACCACCGCCGAGTACTGGGCCGAGCAGGTCCGCAACCCCGTGCTGT 27 00 
55 PTTAEYWAEQVRNPVL 

TCCACGCCCACACCCAGCGGTACCCCGACGCCGTGTTCGTCGAGATCGGC 2750 
FHAHTQRYPDAVFVEIG 
CCCGGCCAGGACCTCTCACCGCTGGTCGACGGCATCGCCCTGCAGAACGG 2800 
PGQDLS PLVDG IALQNG 
60 CACGGCGGACGAGGTGCACGCGCTGCACACCGCGCTCGCCCGCCTCTTCA 28 50 

T ADEVHALHTALARLF 
CACGCGGCGCCACGCTCGACTGGTCCCGCATCCTCGGCGGTGCTTCGCGG 2900 
TRGA T LDWS RI LGGASR 
CACGACCCTGACGTCCCCTCGTACGCGTTCCAGCGGCGTCCCTACTGGAT 2950 
HDPDVPSYAFQRRPYWI
CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCA 3000 

ESAPPATADSGHPVLG 
CCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTG 3050 
5 TGVAVAGS PGRVFTGPV 

CCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100 

PAGADRAVFIAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 3150 
ADATDCATVEQLDVTS 
1 0 TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 3200 

VPGGSARGRATAQTWVD 
GAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGG 3250 

E PAADGRRRFTVHTRVG 
CGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCG 3300 
15 DAPWTLHAEGVLRPGR 

TGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTG 3350
VPQPEAVDTAWPPPGAV 
CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 34 00 
PADGL PGAWRRADQVFV 
20 CGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 34 50 

EAEVDS PDGFVAHPDL 
TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 3500 
LDAVFSAVGDGSRQPTG
TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550 
25 WR DLAVHASDATVLRAC 

CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600 

LTRRDSGVVELAAFDG 
CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650 
AGMPVLTAESVTLGEVA 
30 TCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 3700 

SAGGS DESDGLLRLEWL 
GCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 37 50 

PVAEAHYDGADELPEG 
ACACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAAC 3800 
YTLITATHPDDPDDPTN

CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 3850 

PHNTPTRTHTQTTRVLT 
CGCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3900 
ALQHHLITTNHTLIVH 
40 CCACCACCG ACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCA 3950 

TTTDPPGAAVTGLTRTA 
CAAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCA 4000 

QNEHPGRI HLIETHHPH 
CACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTAC 4 050 
45 TPLPLTQLTTLHQPHL 

GCCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACC 4100 
RLTNNTLHTPHLTPITT 
CACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAA 4150 
HHNTTTTTPNTPPLNPN 
50 CCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCG 4200 

HAI LI TGGSGTLAG IL 
CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4 250 
ARHLNHPHTYLLSRTPP 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCAC 4 300 
55 PPTTPGTHIPCDLTDPT 

CCAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCT 4 350 

QITQALTHIPQPLTGI 
TCCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 400 
FHTAATLDDATLTNLTP 
CAACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCT 4 450 

QHLTTTLQPKADAAWHL 
CCACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCA 4 500 

HHHTQNQPLTHFVLYS 
GCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCC 4 550 
SAAATLGSPGQANYAAA
AACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACC 4 600 

NAFLDALATHRHTQGQP 
CGCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCA 4 650 
5 ATT IAWGMWHTTTTLT 

GCCAACTCACCGAC AGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 4 700 

SQLTDSDRDRIRRGGFL 

CCGATCTCGGACGACGAGGGCATGC 

PISDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
MRLYEAARRTGSPVVV 
15 GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDV PLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
20 RSPCCPTTSAPTPPSRS 

TCCTGG AACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
PATTTFKELG I DSLTA 
25 TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
30 DELAGTRAPVAARTAA 

CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAH DE P LAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 
LPGGVASPQELWRLVAS 
35 CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
G TDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
40 HGG FL DGATG FD AAFFG 

GATCAGCCCGCGCGAGGCCCTGGCCATGG ACCCGCAGCAACGGGTGCTCC 7 50 

I S PREA LAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
45 GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 8 50 
ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950
50 SVLSGRLSYFYGLEGPS 

GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG
55 TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS PGG FVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
60 TSFAEGAGALVVERLS 

ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
Q E R V I HQALANAKLTP 
5 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 

P I EAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
10 PLLLGSLKSNIGHAQA 

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAG I I KMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELP PTLHADEPS PHVDW 
15 GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 1700 
TGRPRRAGVSSFG I SGT 
AACGCCCACGTCATCCTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAA 1750 
NAHVILESAPPTQPADN

CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 1800 

AVIERAPEWVPLVISA 
GGACCCAGTCGGCTTTGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
25 GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1 900 
AASPGVDMRAVASTLAM 
GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
TCACCGGCACC GCTGTGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGA 2000 
30 VTGTAVS DPRAVFVFPG 

CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050 

QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 
VFARI HQQVWDLLDVP 
35 ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150 
DLEVNETGYAQPALFAM
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVALFG LLESWGVRPDA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
40 VI GHSVGELAAAYVSG 

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2 300 
VWS LEDACTL VSARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2350 
MQALPAGGVMVAVPVSE 
45 GGATG AGGCCCGGGCCGTGCTGGGTG AGGGTGTGG AG ATCGCCGCGGTC A 2400 
DEARAVLGEGVE IAAV 
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTG ATGAGGCCGCCGTGCTGCAG 24 50 
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2500 
50 AAEGLGKWTRLATSHAF 

CC ATTCCGCCCGT ATGGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCG 2550 

HSARMEPMLEEFRAVA
AAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2600 
EGLTYRT PQVSMAVGDQ 
55 GTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2 650 
VTTAEYWVRQVRDTVRF 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTG 2700 

GEQVASYEDAVFVELG 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 2750 
60 ADRS LARLVDGVAMLHG 

GACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800 

DH E IQAAIGALAH LYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 28 50 
GVTVDWPALLGDAPA T 
GGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCAGCGCTACTGGCTC 2900
RVLDLPTYAFQHQRYWL 
GAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCAC 2950 
E S A P PATADSGHPVLGT 
5 CGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGC 3000 
GVAVAGS PGRVFTGPV 
CCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PAGADRAVFIAELALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGT 3100 
10 ADAT DCATVEQLDVTSV 

GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATG 3150

PGG SARGRATAQTWVD 
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
EPAADGRRRFTVHTRVG 
15 GACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGT 32 50 
DAPWTLHAEGVLRPGRV 
GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300 

PQPEAVDTAWPPPGAV 
CCGCGG ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTC 3350 
20 PADGLPGAWRRADQVFV 

GAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCT 34 00 

EAEVDSPDGFVAHPDLL
CGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGAT 34 50 
DAV FSAVG DGS RQ P TG 
25 GGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGC 3500 
WRDLAVHASDATVLRAC 
CTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550 

LTRRDSGVVELAAFDGA 
CGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGT 3600 
GMPVLTAESVTLGEVA

CGGC AGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTG 3650 
SAGGSDESDGLLRLEWL 
CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTA 3700 
PVAEAHYDGADELPEGY 
35 CACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 3750 
TLI TATHPDDPDDPTN 
CCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACC 3800 
PHNT PTRTHTQTTRVLT 
GCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACAC 38 50 
ALQHHLITTNHTLIVHT

CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3 900 

TTDPPGAAVTGLTRTA 
AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCAC 3950 
QNEH PGRIHLIETHHPH 
45 ACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACG 4 000 
TP L PLTQLTTLHQPHLR 
CCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCC 4 050 

LTNNTLHTPHLTPITT 
ACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAAC 4 100 
HHNTTTTTPNTPPLNPN

CACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGC 4150 

H A I L ITGGSGTLAGI LA 
CCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACC ACCAC 4 200 
RHLNHPHTYLLSRTPP 
55 CCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACC 4 250 
PPTTPGTHI PCDLTDPT 
CAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTT 4 300 

QITQALTHI PQPLTGIF 
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4 3 50 
60 HTAATLDDATLTNLTP 

AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4 4 00 
QHLTTTLQPK ADAAWHL 
CACCACCACACCCAAAACC AACCCCTCACCCACTTCGTCCTCTACTCCAG 4 4 50 
H H H TQNQ PLTH FVLYSS 
CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4500

AAATLGS PGQANYAAA 
ACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCC 4 550 
NAFLDALATHRHTQGQP 
5 GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4 600 
ATT IAWGMWHTTTTLTS 
CCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGC 4 650 

QLTDSDRDRIRRGGFL
CGATCTCGGACGACGAGGGCATGC 
10 PI SDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
15 MR LYEAARRTGSPVVV 

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100

AAA LDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERSLAD 
20 GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWN STATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGG AACTCGGCATCGACTCGCTCACCGCGG 300 
25 PATTTFKELGIDSLTA 

TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGT ACGCCTCAACGCC 350 
VQL RNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAV FDFPT PRALAARLG 
30 CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAH DE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 
LPGGVASPQELWRLVAS

CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 

GTDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DAL YDPDPDAIG KTFVR 
40 CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

IS PREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
45 LETSWEAFESAG ITPDA 

GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT 
50 GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS
GTCACGGTCG ACACCGCCTGCTCGTCGTC ACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACS SSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
55 GQS LRSGECSLALVGG 

TCACGGTGATGGCGTCGCCCGGCGGATTCGTCG AGTTCTCCCGGCAGCGC 1100 
VTVMAS PGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLA PDGRAKAF GAGADG 
60 TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TS FAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1 2 50 
DAERHGHTVLALVRGSA
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350
5 QERV I HQALANAKLTP 

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADV DAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PI EAQALLATYGQDRAT 
10 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGS LKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI I KMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
15 ELPPTLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVE L LTSARPWPG 
CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC 1700 
TGR PRRAGVSSFGVSGT 
AACGCCCACGTCATCCTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGA 1750
NAHVILESAPPAQPAEE 
GGCGCAGCCTGTTGAGACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGG 1800 

AQPVETPVVASDVLPL 
TGAT ATCGGCCAAGACCC AGCCCGCCCTGACCGAACACGAAGACCGGCTG 1850 
25 V I SAKTQPALTEHEDR L 

CGCGCCTACCTGGCGGCGTCGCCCGGGGCGGATATACGGGCTGTGGCATC 1900 

RAYLAASPGADIRAVAS
GACGCTGGCGGTGACACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTG 1950 
TLAVTRSVFEHRAVLL 
30 GAGATGACACCGTCACCGGCACCGCGGTGACCGACCCCAGGATCGTGTTT 2000 
GDDTVTGTA VTDPRIVF 
GTCTTTCCCGGGCAGGGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCG 2050 

VFPGQGWQWLGMGSALR 
CGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGT 2100 
35 DS SVVFAERMAECAAA 

TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGCG 2150 
LRE FVDWDLFTVLDDPA 
GTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGT 2200 
VVDRVDVVQPASWAMMV 
40 TTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGA 2250 
SLAAVWQAAGVRPDAV 
TCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTG 2300 
IGHSQGEIAAACVAGAV
TCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGC 2350 
45 SLRDAARIVTLRSQAIA 

CCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGC 2 400 

RGLAGRGAMASVALPA 
AGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCCC 24 50 
QDVELVDGAWIAAHNGP 
50 GCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCAC 2 500 
ASTVIAGTPEAVDHVLT 
CGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATC ACCGTCGACTATG 2550 

AH EAQGVRVRRITVDY 
CCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACATC 2 600 
55 ASHTPHVELIRDELLDI 

ACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGT 2 650 

TSDSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 2700 
DGTWVDS PLDGEYWYR 
60 ACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 2750 
NLREPVGFHPAVSQLQA
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 

QGDTVFVEVSA S P VLLQ 
GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 28 50 
AMDDDVVTVATLRRDD
GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2900 
GDATRMLTALAQAYVHG 
GTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2950 
5 VTVDWPAILGTTTTRVL 

GGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGG 3000 

DLPTYAFQHQRYWLES 
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGTC 3050 
APPATADSGHPVLGTGV 
10 GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100 
AVAGS PGRVFTGPVPAG 
TGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 

ADRAV FIAELALAAAD 
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200 
15 ATDCATVEQLDVTSVPG 

GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250 

GSARGRATAQTWVDEPA 
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300 
ADGRRRFTVHTRVGDA 

20 

CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350 
PWTLHAEGVL RPGRVPQ 
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 34 00 

P EAV DTAW P P PGAV P AD 
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 34 50 
GLPGAWRRADQVFVEA

AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500 
EVDS PDGFVAHPDLLDA 
GTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550 
VFSAVGDGSRQPTGWRD 
30 CCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600 
LAVHAS DATVLRACLT 
GCCGCGACAGT GGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650 
RRDSGVVELAAFDGAGM 
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 37 00 
35 PVLTAESVTLGEVASAG 

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 37 50 

GSDESDGLLRLEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 38 00 
AEAHYDGADELPEGYTL 
40 ATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 38 50 
ITATHPDDPDDPTNPHN 
CACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTCC 3900 

T PTRT H TQTTRVLTAL 
AACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCACC 3950 
QHHLITTNHTLIVHTTT

G ACCCCCC AGGCGCCGCCGTC ACCGGCCTC ACCCGCACCGCAC AAAACGA 4 000 

D PPGAAVTGLT RTAQNE 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCAC 4 050 
HPGRI HLIETHHPHTP 
50 TCCCCCTCACCCAACTCACCACCCTCC ACCAACCCCACCTACGCCTCACC 4100 
LPLTQLTTLHQPHLRLT 
AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACAA 4150 

NNTLHTPHLTPITTHHN 
CACCACC ACAACCACCCCCAAC ACCCC ACCCCTCAACCCCAACCACGCCA 4 200 
55 TTTT T PNT P PLN PNHA 

TCCTC ATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCAC 4 250 
I LI TGGSGTLAGI LARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4 300 
L NHPHT YLLSRTPPPPT 
60 CACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4 350 
TPGTHI PCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 4 4 00 
TQALTHIPQPLTGIFHT
GCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACCT 4 4 50 
AATLDDATLTNLTPQHL
CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4 500 

TTTLQPKADAAWHLHH 
ACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCC 4 550 
5 HTQNQPLTHFVLYSSAA 

GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4 600 

ATLGS PGQANYAAANAF 
CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4650
LDALATHRHTQGQPAT 
1 0 CCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACTC 4 700 
T IAWGMWHTTTTLTSQL 
ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCG ATCTC 4 7 50 

TDS DRDRIRRGGFLPIS 
GGACGACGAGGGCATGC 
15 D D E G M 

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
20 MR L Y EAARRTGS PVVV 

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERSLAD 
25 GCTCGCCGTGCTGCCCG ACGACG AGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
30 PATTTFKELGI DSLTA 

TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLR NALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAV FDFPT PRAL AARLG 
35 CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAH DE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 
40 LPGGVASPQEL WRLVAS 

CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 

GTDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
45 CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 
HGG FLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 

IS PREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
50 L ET S WEA F E SAG I T P DA 

GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 8 50 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT 
55 GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLS YFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
60 GQS LRSGECSLALVGG 

TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS PGGFVEFSRQR 

GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TS FAEGAGALVVERLS 
5 ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PI EAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500
PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPSPHVDW

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1 650 

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSSFGVSGT
25 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 
N AH I I LEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVG PVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGC ACCGGGCG AAGACCTTCCGCTG 18 50 
GPLPAAPPSAPGEDLPL

CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1 900 

LVSARS PEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1 950 
RAYLDTG PGVDRAAVA 
3 5 AG ACACTGGCCCGGCGT ACGC ACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QTLARRT H FTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
40 VYSGQGTQHPAMGEQL 

CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150 
AAA FPVFARI HQQVWDL 
CTCG ATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGC 2200 
LDVPDLEVNETGYAQPA 
45 CCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 2250 
LFAMQVALFGLLESWG 
TACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCG 2300 
VRPDAVIGHSVGELAAA
TATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGC 2350 
50 YVSGVWSLEDACTLVSA 

GCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTG 24 00 

RARLMQALPAGGVMVA 
TCCCGGTCTCCGAGGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAG 24 50 
VPVSEDEARAVLGEGVE
55 ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500 
IAAVNG PSSVVLSGDEA 
CGCCGTGCTGCAGGCCGCGGAGGGCCTGGGGAAGTGGACGCGGCTGGCGA 2550 

AVLQAAEGLGKWTRLA 
CCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2 600 
60 TSHAFHSARMEPMLEEF 

CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGC 2650 

RAVAEG LTYRTPQVSMA 
CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 2700 
VGDQVTTAEYWVRQVR 
ACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTC 2750
DTVRFGEQVASYEDAVF 
GTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGC 2 800 
VELGADRSLARLVDGVA 
5 GATGCTGCACGGCGACCACG AAATCCAGGCCGCGATCGGCGCCCTGGCCC 28 50 
MLHG DHEIQAAIGALA 
ACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGAT 2900 
HLYVNGVTVDWPALLGD 
GCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCA 2 950 
10 APATRVLDLPTYAFQHQ 

GCGCTACTGGCTCGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RYWLESAP PATADSGH 
CCGTCCTCGGCACCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTC 3050 
PVLGTGVAVAGSPGRVF 
15 ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100 
TGPVPAGADRAVFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 

ALAAADATDCATVEQL 
ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 3200 
20 DVTSVPGGSARGRATAQ 

ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250

TWVDEPAADGRRRFTVH 
CACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3300 
TRVGDAPWTLHAEGVL 
25 GCCCCGGCCGCGTGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCG 3350 
RPGRVPQPEAVDTAWPP 
CCGGGCGCGGTGCCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGA 34 00 

PGAVPADGLPGAWRRAD 
CCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 34 50 
QVFVEAEVDSPDGFVA

ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3500 
HPDLLDAVFSAVGDGSR 
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 3550 
QPTGWRDLAVHASDATV 
35 GCTGCGCGCCTGCCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCG 3600 
LRACLTRRDSGVVELA 
CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3650 
AFDGAGMPVLTAESVTL 
GGCGAGGTCGCGTCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCG 37 00 
40 GEVASAGGS DESDGLLR 

GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGC 3750 

LEWLPVAEAHYDGADE 
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 3800 
LPEGYTLITATHPDDPD 
45 G ACCCCACCAACCCCC ACAACACACCCACACGC ACCCAC AC ACAAACCAC 3850 
DPTNPHNTPTRTHTQTT 
ACGCGTCCTCACCGCCCTCCAACACCACCTCATCACCACCAACCACACCC 3900 

RV LT ALQH H L I TTN HT 
TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTC 3950 
50 LIVHTTT DP PGAAVTGL 

ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4 000 

TRTAQNEHPGRIHLIET 
CCACCACCCCCACACCCCACTCCCCCTCACCCAACTCACCACCCTCCACC 4 050 
HHPHTPLPLTQLTTLH 
55 AACCCCACCTACGCCTCACCAACAACACCCTCCACACCCCCCACCTCACC 4100 
QPHLRLTNNTLHTPHLT 
CCCATCACCACCCACCACAACACCACCACAACCACCCCCAACACCCCACC 4150 

PITTHHNTTTTTPNTPP 
CCTCAACCCCAACCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCG 4 200 
60 LNPNHAILITGGSGTL 

CCGGCATCCTCGCCCGCCACCTCAACC ACCCCCACACCTACCTCCTCTCC 4 250 
AGILARHLNHPHTYLLS 
CGCACACCACCACCCCCCACCACACCCGGCACCCACATCCCCTGCGACCT 4 300 
RTPPPPTTPGTHIPCDL
CACCGACCCCACCCAAATCACCCAAGCCCTCACCCACATACCACAACCCC 4350

TDPTQITQALTH I P Q P 
TCACCGGCATCTTCCACACCGCCGCCACCCTCGACGACGCCACCCTCACC 4 4 00 
L T G I FHTAATLDDATLT 
5 AACCTCACCCCCCAACACCTCACCACCACCCTCCAACCCAAAGCCGACGC 4 4 50 
NLT PQHLTTTLQPKADA 
CGCCTGGCACCTCCACCACCACACCCAAAACCAACCCCTCACCCACTTCG 4 500 

AWHLHHHTQNQPLTHF 
TCCTCTACTCCAGCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAAC 4 550 
10 VLYSSAAATLGSPGQAN 

TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4 600 

YAAANAFLDALATHRHT 
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4 650 
QGQPATTIAWGMWHTT 
1 5 CCAC ACTC ACC AGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 4 700 
TTLTSQLTDSDRDRIRR 
GGCGGCTTCCTGCCGATCTCGGACGACGAGGGCATGC 
GGFLPISDDEGM 

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
25 AAALDDAPDVPLLRGLR 

GCGT ACG ACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERS LAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
30 TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DS LTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
35 VQLRNALTTATGVRLNA 

ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAV F D F P T P RALAARLG 
CG ACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
40 CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
45 GTDAITEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 
HGGFLDGATGFDAAF FG 
50 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 
I S PREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAG I TPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
55 ARGSDTGVFIGAFSYGY 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNG FGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
60 GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTA CSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS P G G FVEFSRQR 

GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG

TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TS FAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA
1 0 GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 
ANSDGASNGLSAPNGPS
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 

QERVIHQALANAKLTP 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
15 ADVDAVEAHGTGTRLGD 

CCCATCGAGGCGCAGGCGCTGCTCGCG ACGTACGGACAGGACCGGGCGAC 14 50 

PI EAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
20 CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAG I I KMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1 600 

ELPPTLHADEPSPHVDW
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
25 TAGAVELLTSARP WPG 

CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSSFGVSGT 
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 
NAH I ILEAGPVKTGPVE 
30 GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 
A G A I EAG PVEVG PVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCG AAGACCTTCCGCTG 1850 
GPLPAAPPSAPGEDLPL 
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 
35 LVSARS PEALDEQIGRL 

GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950

RAYLDTGPGVDRAAVA
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QT LARRTH FTHRAVLLG 
40 GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCG ACGAACTCGTCTT 2050 
DTV I GAP PADQADELVF 
CGTCT ACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCT AG 2100 

VY S GQGTQH PA MGEQL 
CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150 
45 ADSSVVFAERMAECAAA 

TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 

LREFVDWDLFTVLDDPA 
GGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2250 
VVDRVDVVQPASWAMM 
50 TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2300 
VS LAAVWQAAGVRPDAV 
ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 

IG HSQGE IAAACVAGAV 
GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 24 00 
55 SLRDAARI VTLRSQAI 

CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 24 50 
ARGLAGRGAMASVALPA 
CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2 500 
QDVELVDGAWIAAHNGP 
60 CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2550 
ASTVIAGT PEAVDHVL 
CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2600
TAHEAQGVRVRRITVDY 
GCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACAT 2 650 
ASHTPHVELIRDELLDI
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 2700 

TSDSSSQTPLVPWLST 
TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 27 50 
5 VDGTWVDS PLDGEYWYR 

AACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 2800

NLREPVGFHPAVSQLQA 
CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 2850 
QGDTVFVEVSASPVLL 
10 AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2900 
QAMDDDVVTVATLRRDD 
GGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2950 

GDAT RMLTALAQAYVHG 
CGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000 
15 VTVDWPAILGTTTTRV 

TGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 3050 
LDLPTYAFQHQRYWLES 
GCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100 
APPATADSGHPVLGTGV
20 CGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 3150 
AVAGS PGRVFTG PVPA 
GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 3200 
GADRAV FI AELALAAAD 
GCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGG 3250 
25 ATDCATVEQLDVTSVPG 

CGG ATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCG ATGAACCCG 3300 

GSARGRATAQTWVDEP 
CCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCC 3350 
AADGRRRFTVHTRVGDA 
30 CCGTGGACGCTGCACGCCG AGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 3400 
PWTLHAEGVLRPGRVPQ
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 34 50 

PEAVDTAWPPPGAVPA 
ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3500 
35 DGL PGAWRRADQV FVEA 

GAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGC 3550 

EVDS PDGFVAH PDLLDA 
GGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCG 3 600 
VFSAVGDGSRQPTGWR 
40 ACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACC 3650 
DLAVHASDATVLRACLT 
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 3700 

RRDSGVVELAAFDGAGM 
GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 37 50 
45 PVLTAESVTLGEVASA 

GCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTG 3800 
GGSDESDGLLRLEWLPV 
GCGGAGGCCCACT ACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 3850 
AEAHYDGADELPEGYTL 
50 CATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 
ITATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3950 
N T PT RT H TQTTRV LTAL 
CAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCAC 4 000 
55 QHHLITTNHTLIVHTTT 

CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4 050 

DPPGAAVTGLTRTAQN 
AACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCA 4100 
EHPGRI HLIETHHPHTP 
60 CTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4150 
LPLTQLTTLHQPHLRLT 
CAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4 200 

NNTLHTPHLTPITTHH
ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4 250 
NTTTTTPNTPPLNPNHA
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4 300 

ILITGGSGTLAGILARH 
CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4 350 
5 LNH PHTYLLSRTPPPP 

CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4 4 00 
TTPGTHI PCDLTDPTQI 
ACCCAAGCCCTCACCCACATACC AC AACCCCTCACCGGCATCTTCCACAC 4 4 50 
TQALTHI PQPLTGIFHT 
10 CGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4 500 
AATLDDATLTNLTPQH 
TCACC ACCACCCTCC AACCCAAAGCCGACGCCGCCTGGCACCTCCACCAC 4 550 
LTTTLQPKADAAWHLHH 
CACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGC 4 600 
HTQNQPLTHFVLYSSAA

CGCCACCCTCGGCAGCCCCGGCC AAGCCAACTACGCCGCCGCCAACGCCT 4 650 

AT LGS PGQANYAAANA 
TCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4 7 00 
FLDALATHRHTQGQ PAT 
20 ACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACT 4 7 50 
TIAWGMWHTTTTLTSQL 
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCT 4 800 

TDSDRDRI RRGGFLPI 
CGGACGACGAGGGCATGC 
25 S D D E G M 

Example 3 

Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520 
The present invention provides a variety of recombinant PKS genes in addition to
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520
and FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent
No. 5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT
coding sequences have been replaced by either the rap AT3 (the AT domain from module
3 of the rapamycin PKS), rap AT12, ery AT1 (the AT domain from module 1 of the
erythromycin (DEBS) PKS), or ery AT2 coding sequences. Each of these constructs
provides a PKS that produces the 13-desmethoxy-13-methyl derivative, except for the
rap AT12 replacement, which provides the 13-desmethoxy derivative, i.e., it has a
hydrogen where the other derivatives have methyl.
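
The relationship between the donor AT domain and the C-13 substituent described
above can be summarized as a small lookup. The following Python sketch is
illustrative only: the dictionary and function names are hypothetical, the
extender-unit labels are those given in the table of heterologous AT domains in
this Example, and the mapping from extender unit to substituent simply restates
the outcome described in the text.

```python
# Extender unit associated with each AT domain, as labeled in the table of
# heterologous AT domains in this Example.
AT_EXTENDER = {
    "FK-506 AT8": "hydroxymalonyl",
    "rap AT3": "methylmalonyl",
    "rap AT12": "malonyl",
    "ery AT1": "methylmalonyl",
    "ery AT2": "methylmalonyl",
}

# Substituent found at C-13 of the final compound for each extender unit
# (assumption: restated from the text, not an independent prediction).
C13_SUBSTITUENT = {
    "hydroxymalonyl": "methoxy",
    "methylmalonyl": "methyl",
    "malonyl": "hydrogen",
}


def c13_derivative(at_domain: str, parent: str = "FK-520") -> str:
    """Name the compound expected when module 8 carries the given AT domain."""
    substituent = C13_SUBSTITUENT[AT_EXTENDER[at_domain]]
    if substituent == "methoxy":
        return parent                                  # unmodified compound
    if substituent == "methyl":
        return f"13-desmethoxy-13-methyl-{parent}"
    return f"13-desmethoxy-{parent}"                   # hydrogen at C-13


if __name__ == "__main__":
    for domain in AT_EXTENDER:
        print(domain, "->", c13_derivative(domain, "FK-506"))
```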

Figure 7 shows the process used to generate the AT replacement constructs.
First, a fragment of ~4.5 kb containing module 8 coding sequences from the FK-520
cluster of ATCC 14891 was cloned using the convenient restriction sites SacI and SphI
(Step A in Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment
comprising module 8 coding sequences from other FK-520 or FK-506 clusters can be
different depending on the DNA sequence, but the overall scheme is identical. The
unique SacI and SphI restriction sites at the ends of the FK-520 module 8 fragment were
then changed to unique BglII and NsiI sites by ligation to synthetic linkers (described in
the preceding Examples, see Step B of Figure 7). Fragments containing sequences 5' and
3' of the AT8 sequences were then amplified using primers, described above, that
introduced either an AvrII site or an NheI site at two different KS/AT boundaries and an
XhoI site at the AT/DH boundary (Step C of Figure 7). Heterologous AT domains from
the rapamycin and erythromycin gene clusters were amplified using primers, as
described above, that introduced the same sites as just described (Step D of Figure 7).
The fragments were ligated to give hybrid modules with in-frame fusions at the KS/AT
and AT/DH boundaries (Step E of Figure 7). Finally, these hybrid modules were ligated
into the BamHI and PstI sites of the KC515 vector. The resulting recombinant phage
were used to transform the FK-506 and FK-520 producer strains to yield the desired
recombinant cells, as described in the preceding Examples.
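
The text does not state why BglII/NsiI-ended modules can be ligated into the
BamHI and PstI sites of the vector; a likely reading is that each pair of
enzymes leaves matching cohesive ends. The Python sketch below checks this from
the standard recognition sequences and cut positions of the four enzymes. The
function names are hypothetical, and the simple overhang comparison is only
valid for palindromic sites such as these.

```python
# Recognition site and number of top-strand bases 5' of the cut.
ENZYMES = {
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NsiI": ("ATGCAT", 5),   # ATGCA^T
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def overhang(enzyme: str) -> tuple[str, str]:
    """Return (single-stranded sequence, polarity) left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut              # bottom-strand cut in top-strand coordinates
    low, high = sorted((cut, mirror))
    polarity = "5'" if cut < mirror else "3'"
    return site[low:high], polarity


def compatible(a: str, b: str) -> bool:
    """Two ends can anneal if the overhangs have the same sequence and polarity."""
    return overhang(a) == overhang(b)


if __name__ == "__main__":
    for insert_end, vector_end in (("BglII", "BamHI"), ("NsiI", "PstI")):
        print(insert_end, overhang(insert_end),
              vector_end, overhang(vector_end),
              "compatible:", compatible(insert_end, vector_end))
```

Under these assumptions the BglII end pairs with the BamHI end and the NsiI end
with the PstI end, which also fixes the orientation of the insert in the vector.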

The following table shows the location and sequences surrounding the engineered
site of each of the heterologous AT domains employed. The FK-506 hybrid construct
was used as a control for the FK-520 recombinant cells produced, and a similar FK-520
hybrid construct was used as a control for the FK-506 recombinant cells.



Heterologous AT     Enzyme   Location of Engineered Site

FK-506 AT8          AvrII    GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC
(hydroxymalonyl)             GRPRRAAVSSF
                    NheI     ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC
                             TQHPAMGERLA
                    XhoI     TACGCCTTCCAGCGGCGGCCCTACTGGatcgag
                             YAFQRRPYWIE

rapamycin AT3       AvrII    GACCGGccccgtCGGGCGGGCGTGTCGTCCTTC
(methylmalonyl)              DRPRRAGVSSF
                    NheI     TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG
                             WQWLGMGSALR
                    XhoI     TACGCCTTCCAACACCAGCGGTACTGGgtcgag
                             YAFQHQRYWVE

rapamycin AT12      AvrII    GGCCGAgcgcgcCGGGCAGGCGTGTCGTCCTTC
(malonyl)                    GRARRAGVSSF
                    NheI     TCGCAGCGTGCTGGCATGGGTGAGGAactggcC
                             SQRAGMGEELA
                    XhoI     TACGCCTTCCAGCACCAGCGCTACTGGctcgag
                             YAFQHQRYWLE

DEBS AT1            AvrII    GCGCGAccgcgcCGGGCGGGGGTCTCGTCGTTC
(methylmalonyl)              ARPRRAGVSSF
                    NheI     TGGCAGTGGGCGGGCATGGCCGTCGAcctgctC
                             WQWAGMAVDLL
                    XhoI     TACCCGTTCCAGCGCGAGCGCGTCTGGctcgaa
                             YPFQRERVWLE

DEBS AT2            AvrII    GACGGGgtgcgcCGGGCAGGTGTGTCGGCGTTC
(methylmalonyl)              DGVRRAGVSAF
                    NheI     GCCCAGTGGGAAGGCATGGCGCGGGAgttattG
                             AQWEGMARELL
                    XhoI     TATCCTTTCCAGGGCAAGCGGTTCTGGctgctg
                             YPFQGKRFWLL
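
To illustrate how the entries in the table can be read, the Python sketch below
takes the three FK-506 AT8 rows, replaces the six lower-case bases with the
recognition sequence of the named enzyme, and translates the result. It assumes
that the engineered site replaces exactly the lower-case bases and that each
table entry is written in frame from its first base; the function names are
hypothetical. The recognition sequences used (AvrII CCTAGG, NheI GCTAGC, XhoI
CTCGAG) are the standard ones for these enzymes.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}


def translate(dna: str) -> str:
    """Translate complete codons of `dna`, reading from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON[dna[i:i + 3]] for i in range(0, usable, 3))


def engineer(context: str, enzyme: str) -> str:
    """Replace the lower-case run in `context` with the enzyme's site."""
    marked = [i for i, ch in enumerate(context) if ch.islower()]
    start, end = marked[0], marked[-1] + 1
    site = SITES[enzyme]
    if end - start != len(site):
        raise ValueError("marked region and recognition site differ in length")
    return context[:start] + site.lower() + context[end:]


# FK-506 AT8 rows of the table; lower case marks the engineered position.
FK506_AT8 = {
    "AvrII": "GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC",
    "NheI": "ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC",
    "XhoI": "TACGCCTTCCAGCGGCGGCCCTACTGGatcgag",
}

if __name__ == "__main__":
    for enzyme, native in FK506_AT8.items():
        modified = engineer(native, enzyme)
        print(enzyme, translate(native), "->", translate(modified))
```

Under these assumptions the AvrII and NheI changes leave the encoded peptide
unchanged, while the XhoI change converts the isoleucine codon (atc) to a
leucine codon (ctc).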
The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGccacggC
AGAVELLTSARPWPETDRPR
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG
RAAVSSFGVSGTNAHVILEA
GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG
GPVTETPAASPSGDLPLLVS
CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA
ARSPEALDEQIRRLRAYLDT
CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC
TPDVDRVAVAQTLARRTHFA
ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG
HRAVLLGDTVITTPPADRPD
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCAgctcg
ELVFVYSGQGTQHPAMGEQL
cCGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC
AAAHPVFADAWHEALRRLDN

The sequences shown below provide the location of the AT/DH boundary chosen
in the FK-520 module 8 coding sequences. The region where an XhoI site was
engineered is indicated by lower case and underlining.

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC
ILGAGSRHDADVPAYAFQRR
ACTACTGGatcgagTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT
HYWIESARPAASDAGHPVLG

The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

TCGGCCAGGCCGTGGCCGCGGACCGGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTCGGG
SARPWPRTGRPRRAAVSSFG
GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG
VSGTNAHIILEAGPDQEEPS
GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC
AEPAGDLPLLVSARSPEALD
GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC
EQIGRLRDYLDAAPGVDLAA
GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGCCGTACTGCTCGGTGAC
VARTLATRTHFSHRAVLLGD
ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA
TVITAPPVEQPGELVFVYSG
CAGGGCACCCAGCATCCCGCGATGGGTGAGCGgctcgcCGCAGCCTTCCCCGTGTTCGCC
QGTQHPAMGERLAAAFPVFA
GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGATCGAGTCCGCGCCG
DPDVPAYAFQRRPYWIESAP

The sequences shown below provide the location of the AT/DH boundary chosen
in the FK-506 module 8 coding sequences. The region where an XhoI site was
engineered is indicated by lower case and underlining.

GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGatcgagTCCGCGCCG
DPDVPAYAFQRRPYWIESAP

Example 4

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520

The methods and reagents of the present invention also provide novel FK-506
and FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or
methyl. These derivatives are produced in recombinant host cells of the invention that
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the
present invention provides recombinant PKS enzymes in which the AT domains of both
modules 7 and 8 have been changed. The table below summarizes the various
compounds provided by the present invention.





Compound   C-13       C-15       Derivative Provided
FK-506     hydrogen   hydrogen   13,15-didesmethoxy-FK-506
FK-506     hydrogen   methoxy    13-desmethoxy-FK-506
FK-506     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-506
FK-506     methoxy    hydrogen   15-desmethoxy-FK-506
FK-506     methoxy    methoxy    Original Compound -- FK-506
FK-506     methoxy    methyl     15-desmethoxy-15-methyl-FK-506
FK-506     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-506
FK-506     methyl     methoxy    13-desmethoxy-13-methyl-FK-506
FK-506     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-506
FK-520     hydrogen   hydrogen   13,15-didesmethoxy-FK-520
FK-520     hydrogen   methoxy    13-desmethoxy-FK-520
FK-520     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-520
FK-520     methoxy    hydrogen   15-desmethoxy-FK-520
FK-520     methoxy    methoxy    Original Compound -- FK-520
FK-520     methoxy    methyl     15-desmethoxy-15-methyl-FK-520
FK-520     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-520
FK-520     methyl     methoxy    13-desmethoxy-13-methyl-FK-520
FK-520     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-520
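
The table follows a regular pattern: three possible substituents at each of C-13 and C-15 give nine combinations per parent compound, one of which is the parent itself. As an illustration only, the Python sketch below enumerates those combinations and builds the derivative names by the pattern used in the table. The function names are hypothetical, and the naming rule is inferred from the table rows rather than stated in the text.

```python
from itertools import product

SUBSTITUENTS = ("hydrogen", "methoxy", "methyl")  # options listed in the table


def derivative_name(parent, c13, c15):
    """Name the derivative for the given C-13 and C-15 substituents."""
    chosen = {13: c13, 15: c15}
    for sub in chosen.values():
        if sub not in SUBSTITUENTS:
            raise ValueError(f"unsupported substituent: {sub}")
    # Positions at which the native methoxy group has been replaced.
    changed = [pos for pos, sub in chosen.items() if sub != "methoxy"]
    if not changed:
        return parent  # both methoxy groups retained: the original compound
    suffix = "-didesmethoxy" if len(changed) == 2 else "-desmethoxy"
    parts = [",".join(map(str, changed)) + suffix]
    # Positions that carry a methyl group in place of the methoxy group.
    methylated = [pos for pos, sub in chosen.items() if sub == "methyl"]
    if methylated:
        suffix = "-dimethyl" if len(methylated) == 2 else "-methyl"
        parts.append(",".join(map(str, methylated)) + suffix)
    return "-".join(parts + [parent])


def enumerate_derivatives(parent):
    """Return (C-13, C-15, name) for every combination of substituents."""
    return [(c13, c15, derivative_name(parent, c13, c15))
            for c13, c15 in product(SUBSTITUENTS, repeat=2)]


for row in enumerate_derivatives("FK-506"):
    print(*row, sep="\t")
```

Calling `enumerate_derivatives("FK-520")` yields the second half of the table in the same way.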



Example 5

Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520

The present invention also provides novel FK-506 and FK-520 derivative
compounds in which the methoxy groups at either or both the C-13 and C-15 positions
are instead ethyl groups. These compounds are produced by novel PKS enzymes of the
invention in which the AT domains of modules 8 and/or 7 are converted to
ethylmalonyl-specific AT domains by modification of the PKS gene that encodes the module.
Ethylmalonyl-specific AT domain coding sequences can be obtained from, for example,
the FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The
novel PKS genes of the invention include not only those in which either or both of the
AT domains of modules 7 and 8 have been converted to ethylmalonyl-specific AT
domains but also those in which the AT domain of one of the modules is converted to an
ethylmalonyl-specific AT domain and the other is converted to a malonyl-specific or a
methylmalonyl-specific AT domain.
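
The relationship between AT domain specificity and the resulting ring substituent can be summarized in a small lookup. The Python sketch below is an illustration under stated assumptions, not the method of the invention: the pairing of module 8 with C-13 and module 7 with C-15 follows Examples 4 and 5, while the names `SUBSTITUENT_BY_EXTENDER`, `POSITION_BY_MODULE`, and `predicted_substituents` are hypothetical, and the "methoxymalonyl" entry is assumed here to stand in for the native specificity that leaves a methoxy group.

```python
# Substituent expected at the ring position for each AT extender specificity.
# The methoxymalonyl entry stands in for the native module 7 and 8 AT domains
# and is an assumption of this sketch.
SUBSTITUENT_BY_EXTENDER = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
    "methoxymalonyl": "methoxy",
}

# Module-to-position pairing as used in Examples 4 and 5.
POSITION_BY_MODULE = {8: "C-13", 7: "C-15"}


def predicted_substituents(at_swaps):
    """Map {module number: extender specificity} to {ring position: substituent}.

    Modules that are not listed keep the native methoxy group.
    """
    result = {position: "methoxy" for position in POSITION_BY_MODULE.values()}
    for module, extender in at_swaps.items():
        result[POSITION_BY_MODULE[module]] = SUBSTITUENT_BY_EXTENDER[extender]
    return result


# Hypothetical usage: ethylmalonyl AT in module 8, malonyl AT in module 7.
print(predicted_substituents({8: "ethylmalonyl", 7: "malonyl"}))
```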

Example 6

Neurotrophic Compounds

The compounds described in Examples 1-4, inclusive, have immunosuppressant
activity and can be employed as immunosuppressants in a manner and in formulations
similar to those employed for FK-506. The compounds of the invention are generally
effective for the prevention of organ rejection in patients receiving organ transplants and
in particular can be used for immunosuppression following orthotopic liver
transplantation. These compounds also have pharmacokinetic properties and metabolism
that are more advantageous for certain applications relative to those of FK-506 or
FK-520. These compounds are also neurotrophic; however, for use as neurotrophins, it is
desirable to modify the compounds to diminish or abolish their immunosuppressant
activity. This can be readily accomplished by hydroxylating the compounds at the C-18
position using established chemical methodology or novel FK-520 PKS genes provided
by the present invention.

Thus, in one aspect, the present invention provides a method for stimulating
nerve growth that comprises administering a therapeutically effective dose of
18-hydroxy-FK-520. In another embodiment, the compound administered is a
C-18,20-dihydroxy-FK-520 derivative. In another embodiment, the compound administered is a
C-13-desmethoxy and/or C-15-desmethoxy 18-hydroxy-FK-520 derivative. In another
embodiment, the compound administered is a C-13-desmethoxy and/or
C-15-desmethoxy 18,20-dihydroxy-FK-520 derivative. In other embodiments, the compounds
are the corresponding analogs of FK-506. The 18-hydroxy compounds of the invention
can be prepared chemically, as described in U.S. Patent No. 5,189,042, incorporated
herein by reference, or by fermentation of a recombinant host cell provided by the
present invention that expresses a recombinant PKS in which the module 5 DH domain
has been deleted or rendered non-functional.
The chemical methodology is as follows. A compound of the invention (~200
mg) is dissolved in 3 mL of dry methylene chloride and added to 45 µL of 2,6-lutidine,
and the mixture stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl
trifluoromethanesulfonate (64 µL) is added by syringe. After 15 minutes, the reaction
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with
brine, and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo
and flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol)
gives the protected compound. The protected compound is dissolved in 95% ethanol
(2.2 mL), and 53 µL of pyridine is added, followed by selenium dioxide (58 mg). The
flask is fitted with a water condenser and heated to 70°C on a mantle. After 20 hours,
the mixture is cooled to room temperature, filtered through diatomaceous earth, and the
filtrate poured into saturated sodium bicarbonate solution. This is extracted with ethyl
acetate, and the organic phase is washed with brine and dried over magnesium sulfate.
The solution is concentrated and purified by flash chromatography on silica gel (ethyl
acetate:hexane (1:2) plus 1% methanol) to give the protected 18-hydroxy compound.
This compound is dissolved in acetonitrile and treated with aqueous HF to remove the
protecting groups. After dilution with ethyl acetate, the mixture is washed with saturated
bicarbonate and brine, dried over magnesium sulfate, filtered, and evaporated to yield the
18-hydroxy compound. Thus, the present invention provides the C-18-hydroxyl
derivatives of the compounds described in Examples 1-4.

Those of skill in the art will recognize that other suitable chemical procedures can
be used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et
al., Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These
methods can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with
the R enantiomer showing a somewhat lower IC50, which may be preferred in some
applications. See Kawai et al., supra. Another preferred protocol is described in Umbreit
and Sharpless, 1977, JACS 99(16): 1526-28, although it may be preferable to use 30
equivalents each of SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents,
respectively, described in that reference.

All scientific and patent publications referenced herein are hereby incorporated
by reference. The invention having now been described by way of written description
and example, those of skill in the art will recognize that the invention can be practiced in
a variety of embodiments, and that the foregoing description and examples are for purposes
of illustration and not limitation of the following claims.

Claims

1. An isolated nucleic acid that encodes a CoA ligase, a non-ribosomal peptide
synthetase, or a domain of an extender module of a polyketide synthase enzyme that
synthesizes FK-520.

2. The isolated nucleic acid of claim 1 that encodes an extender module, said
module comprising a ketosynthase domain, an acyl transferase domain, and an acyl
carrier protein domain.

3. The isolated nucleic acid of claim 1 that encodes an open reading frame, said
open reading frame comprising coding sequences for two or more extender modules,
each extender module comprising a ketosynthase domain, an acyl transferase domain,
and an acyl carrier protein domain.

4. The isolated nucleic acid of claim 1 that encodes a gene cluster, said gene
cluster comprising two or more open reading frames, each of said open reading frames
comprising coding sequences for two or more extender modules, each of said extender
modules comprising a ketosynthase domain, an acyl transferase domain, and an acyl
carrier protein domain.

5. The isolated nucleic acid of claim 2, wherein at least one of said domains is a
domain of a module of a non-FK-520 polyketide synthase.

6. The isolated nucleic acid of claim 1, wherein said nucleic acid is a
recombinant vector capable of replication in or integration into the chromosome of a host
cell.

7. The isolated nucleic acid of claim 6 that is selected from the group consisting
of cosmid pKOS034-120, cosmid pKOS034-124, cosmid pKOS065-M27, and cosmid
pKOS065-M21.

8. The isolated nucleic acid of claim 5, wherein said non-FK-520 polyketide
synthase is rapamycin polyketide synthase, FK-506 polyketide synthase, or erythromycin
polyketide synthase.

9. A method of preparing a polyketide, said method comprising transforming a
host cell with a recombinant DNA vector of claim 6, and culturing said host cell under
conditions such that said polyketide synthase is produced and catalyzes synthesis of said
polyketide.

10. The method of claim 9, wherein said host cell is a Streptomyces host cell.

11. The method of claim 9, wherein said polyketide is selected from the group
consisting of FK-520, 13-desmethoxy-FK-520, and 13-desmethoxy-FK-506.

12. A recombinant host cell that expresses a recombinant polyketide synthase
selected from the group consisting of: (i) an FK-520 polyketide synthase in which at
least one AT domain is replaced by an AT domain of a non-FK-520 polyketide synthase;
(ii) an FK-506 polyketide synthase in which at least one AT domain is replaced by an
AT domain of a non-FK-506 polyketide synthase; (iii) an FK-520 polyketide synthase in
which at least one DH domain has been deleted; (iv) an FK-506 polyketide synthase in
which at least one DH domain has been deleted.

13. The recombinant host cell of claim 12 that expresses an FK-520 polyketide
synthase in which an AT domain of module 8 has been replaced by an AT domain that
binds malonyl CoA, methylmalonyl CoA, or ethylmalonyl CoA.

14. The recombinant host cell of claim 12 that expresses an FK-506 polyketide
synthase in which an AT domain of module 8 has been replaced by an AT domain that
binds malonyl CoA, methylmalonyl CoA, or ethylmalonyl CoA.

15. The recombinant host cell of claim 13, wherein a DH domain of module 5 or
module 6 has been deleted.

16. The recombinant host cell of claim 14, wherein a DH domain of module 5 or
module 6 has been deleted.

17. A recombinant host cell that comprises recombinant genes coding for
enzymes sufficient for synthesis of ethylmalonyl CoA or 2-hydroxymalonyl CoA.

18. A polyketide having the structure
wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and
18-hydroxy-FK-506.

19. The polyketide of claim 18 that is 13-desmethoxy-FK-506.

20. The polyketide of claim 18 that is 13-desmethoxy-18-hydroxy-FK-520.

POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA CONSTRUCTS THEREFOR

Field of the Invention

The present invention relates to polyketides and the polyketide synthase (PKS)
enzymes that produce them. The invention also relates generally to genes encoding PKS
enzymes and to recombinant host cells containing such genes and in which expression of
such genes leads to the production of polyketides. The present invention also relates to
compounds useful as medicaments having immunosuppressive and/or neurotrophic activity.
Thus, the invention relates to the fields of chemistry, molecular biology, and agricultural,
medical, and veterinary technology.

Background of the Invention

Polyketides are a class of compounds synthesized from 2-carbon units through a
series of condensations and subsequent modifications. Polyketides occur in many types of
organisms, including fungi and mycelial bacteria, in particular, the actinomycetes.
Polyketides are biologically active molecules with a wide variety of structures, and the class
encompasses numerous compounds with diverse activities. Tetracycline, erythromycin,
epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin, spinosyn, and tylosin are
examples of polyketides. Given the difficulty in producing polyketide compounds by
traditional chemical methodology, and the typically low production of polyketides in
wild-type cells, there has been considerable interest in finding improved or alternate means to
produce polyketide compounds.

This interest has resulted in the cloning, analysis, and manipulation by recombinant
DNA technology of genes that encode PKS enzymes. The resulting technology allows one
to manipulate a known PKS gene cluster either to produce the polyketide synthesized by
that PKS at higher levels than occur in nature or in hosts that otherwise do not produce the
polyketide. The technology also allows one to produce molecules that are structurally
related to, but distinct from, the polyketides produced from known PKS gene clusters. See,
e.g., PCT publication Nos. WO 93/13663; 95/08548; 96/40968; 97/02358; 98/27203; and
98/49315; United States Patent Nos. 4,874,748; 5,063,155; 5,098,837; 5,149,639;
5,672,491; 5,712,146; 5,830,750; and 5,843,718; and Fu et al., 1994, Biochemistry 33:
9321-9326; McDaniel et al., 1993, Science 262: 1546-1550; and Rohr, 1995, Angew. Chem.
Int. Ed. Engl. 34(8): 881-888, each of which is incorporated herein by reference.

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which are
complexes of multiple large proteins, are similar to the synthases that catalyze condensation
of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the biosynthesis of
polyketides through repeated, decarboxylative Claisen condensations between acylthioester
building blocks. The building blocks used to form complex polyketides are typically
acylthioesters, such as acetyl, butyryl, propionyl, malonyl, hydroxymalonyl,
methylmalonyl, and ethylmalonyl CoA. Other building blocks include amino acid-like
acylthioesters. PKS enzymes that incorporate such building blocks include an activity that
functions as an amino acid ligase (an AMP ligase) or as a non-ribosomal peptide synthetase
(NRPS). Two major types of PKS enzymes are known; these differ in their composition and
mode of synthesis of the polyketide synthesized. These two major types of PKS enzymes
are commonly referred to as Type I or "modular" and Type II or "iterative" PKS enzymes.
In the Type I or modular PKS enzyme group, a set of separate catalytic active sites
(each active site is termed a "domain", and a set thereof is termed a "module") exists for
each cycle of carbon chain elongation and modification in the polyketide synthesis pathway.
The typical modular PKS is composed of several large polypeptides, which can be
segregated from amino to carboxy termini into a loading module, multiple extender
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as
6-deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six
extender modules, and TE of DEBS are present on three separate proteins (designated
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters 304:
205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by reference.

Generally, the loading module is responsible for binding the first building block
used to synthesize the polyketide and transferring it to the first extender module. The
loading module of DEBS consists of an acyltransferase (AT) domain and an acyl carrier
protein (ACP) domain. Another type of loading module utilizes an inactivated ketosynthase
(KS) domain and AT and ACP domains. This inactivated KS is in some instances called
KS^Q, where the superscript letter is the abbreviation for the amino acid, glutamine, that is
present instead of the active site cysteine required for ketosynthase activity. In other PKS
enzymes, including the FK-506 PKS, the loading module incorporates an unusual starter
unit and is composed of a CoA ligase-like activity domain. In any event, the loading module
recognizes a particular acyl-CoA (usually acetyl or propionyl but sometimes butyryl or
other acyl-CoA) and transfers it as a thiol ester to the ACP of the loading module.

The AT on each of the extender modules recognizes a particular extender-CoA
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and
2-hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester.
Each extender module is responsible for accepting a compound from a prior module,
binding a building block, attaching the building block to the compound from the prior
module, optionally performing one or more additional functions, and transferring the
resulting compound to the next module.

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one,
two, or three domains that modify the beta-carbon of the growing polyketide chain. A
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP domain.
These three domains are sufficient to activate a 2-carbon extender unit and attach it to the
growing polyketide molecule. The next extender module, in turn, is responsible for
attaching the next building block and transferring the growing compound to the next
extender module until synthesis is complete.

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the loading
module is transferred to form a thiol ester (trans-esterification) at the KS of the first
extender module; at this stage, extender module one possesses an acyl-KS and a malonyl (or
substituted malonyl) ACP. The acyl group derived from the loading module is then
covalently attached to the alpha-carbon of the malonyl group to form a carbon-carbon bond,
driven by concomitant decarboxylation, and generating a new acyl-ACP that has a backbone
two carbons longer than the loading building block (elongation or extension).

The polyketide chain, growing by two carbons at each extender module, is sequentially
passed as covalently bound thiol esters from extender module to extender module, in an
assembly line-like process. The carbon chain produced by this process alone would possess
a ketone at every other carbon atom, producing a polyketone, from which the name
polyketide arises. Most commonly, however, additional enzymatic activities modify the beta
keto group of each two carbon unit just after it has been added to the growing polyketide
chain but before it is transferred to the next module.

Thus, in addition to the minimal module containing KS, AT, and ACP domains 
necessary to form the carbon-carbon bond, and as noted above, other domains that modify 
the beta-carbonyl moiety can be present. Thus, modules may contain a ketoreductase (KR) 
domain that reduces the keto group to an alcohol. Modules may also contain a KR domain 
plus a dehydratase (DH) domain that dehydrates the alcohol to a double bond. Modules may 
also contain a KR domain, a DH domain, and an enoylreductase (ER) domain that converts 
the double bond product to a saturated single bond using the beta carbon as a methylene 
function. An extender module can also contain other enzymatic activities, such as, for 
example, a methylase or dimethylase activity. 
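
The stepwise reduction just described can be restated as a short decision rule. The Python sketch below is illustrative only: the function name `beta_carbon_state` is hypothetical, and the handling of a DH or ER domain that lacks its upstream partner (treated here as having no effect) is an assumption of the sketch rather than a statement from the text.

```python
def beta_carbon_state(domains):
    """Return the beta-carbon functionality left by an extender module.

    `domains` is an iterable of domain abbreviations such as "KS" or "KR".
    Each reductive step needs the product of the previous one, so DH is
    considered only if KR is present, and ER only if DH is also present
    (an assumption of this sketch).
    """
    present = set(domains)
    state = "ketone"  # minimal module: KS, AT, and ACP only
    if "KR" in present:
        state = "hydroxyl"  # KR reduces the keto group to an alcohol
        if "DH" in present:
            state = "double bond"  # DH dehydrates the alcohol
            if "ER" in present:
                state = "methylene"  # ER saturates the double bond
    return state


# Hypothetical usage with the four module types described above.
for module in (["KS", "AT", "ACP"],
               ["KS", "AT", "KR", "ACP"],
               ["KS", "AT", "DH", "KR", "ACP"],
               ["KS", "AT", "DH", "ER", "KR", "ACP"]):
    print(module, beta_carbon_state(module))
```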

After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide. For
example, final synthesis of 6-dEB is regulated by a TE domain located at the end of
extender module six. In the synthesis of 6-dEB, the TE domain catalyzes cyclization of the
macrolide ring by formation of an ester linkage. In FK-506, FK-520, rapamycin, and similar
polyketides, the TE activity is replaced by a RapP (for rapamycin) or RapP-like activity that
makes a linkage incorporating a pipecolate residue. The enzymatic activity that
catalyzes this incorporation for the rapamycin enzyme is known as RapP, encoded by the
rapP gene. The polyketide can be modified further by tailoring enzymes; these enzymes add
carbohydrate groups or methyl groups, or make other modifications, i.e., oxidation or
reduction, on the polyketide core molecule. For example, 6-dEB is hydroxylated at C-6 and
C-12 and glycosylated at C-3 and C-5 in the synthesis of erythromycin A.

In Type I PKS polypeptides, the order of catalytic domains is conserved. When all
beta-keto processing domains are present in a module, the order of domains in that module
from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of the
beta-keto processing domains may be missing in particular modules, but the order of the domains
present in a module remains the same. The order of domains within modules is believed to
be important for proper folding of the PKS polypeptides into an active complex. Importantly,
there is considerable flexibility in PKS enzymes, which allows for the genetic engineering
of novel catalytic complexes. The engineering of these enzymes is achieved by modifying,
adding, or deleting domains, or replacing them with those taken from other Type I PKS
enzymes. It is also achieved by deleting, replacing, or adding entire modules with those
taken from other sources. A genetically engineered PKS complex should of course have the 
ability to catalyze the synthesis of the product predicted from the genetic alterations made. 
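
The conserved domain order stated above lends itself to a simple consistency check on a proposed module design. The following Python sketch is a hedged illustration, assuming that a design is described as a list of domain abbreviations; the function name `is_valid_extender_module` is hypothetical, and the check covers only domain order and the presence of the minimal KS, AT, and ACP domains, not linker compatibility or catalytic competence.

```python
# N-to-C-terminal order of domains when all are present, as stated in the text.
CANONICAL_ORDER = ("KS", "AT", "DH", "ER", "KR", "ACP")
MINIMAL_DOMAINS = {"KS", "AT", "ACP"}


def is_valid_extender_module(domains):
    """Check that a module has KS, AT, and ACP and follows the conserved order."""
    if not MINIMAL_DOMAINS.issubset(domains):
        return False  # a minimal extender module needs KS, AT, and ACP
    if len(set(domains)) != len(domains):
        return False  # each domain may appear at most once in this sketch
    if any(domain not in CANONICAL_ORDER for domain in domains):
        return False  # unknown domain abbreviation
    ranks = [CANONICAL_ORDER.index(domain) for domain in domains]
    return ranks == sorted(ranks)


# Hypothetical usage: the second design places KR before DH and is rejected.
print(is_valid_extender_module(["KS", "AT", "DH", "KR", "ACP"]))
print(is_valid_extender_module(["KS", "AT", "KR", "DH", "ACP"]))
```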

Alignments of the many available amino acid sequences for Type I PKS enzymes
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N-
and C-termini of individual polypeptides. The sequences of these linker regions are less
well conserved than are those for the catalytic domains, which is in part how linker regions
are identified. Linker regions can be important for proper association between domains and
between the individual polypeptides that comprise the PKS complex. One can thus view the
linkers and domains together as creating a scaffold on which the domains and modules are
positioned in the correct orientation to be active. This organization and positioning, if
retained, permits PKS domains of different or identical substrate specificities to be
substituted (usually at the DNA level) between PKS enzymes by various available
methodologies. In selecting the boundaries of, for example, an AT replacement, one can
thus make the replacement so as to retain the linkers of the recipient PKS or to replace them
with the linkers of the donor PKS AT domain, or, preferably, make both constructs to
ensure that the correct linker regions between the KS and AT domains have been included
in at least one of the engineered enzymes. Thus, there is considerable flexibility in the
design of new PKS enzymes with the result that known polyketides can be produced more
effectively, and novel polyketides useful as pharmaceuticals or for other purposes can be
made.

By appropriate application of recombinant DNA technology, a wide variety of
polyketides can be prepared in a variety of different host cells provided one has access to
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes.
The present invention helps meet the need for such nucleic acid compounds by providing
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520 modification
enzymes. Moreover, while the FK-506 and FK-520 polyketides have many useful activities,
there remains a need for compounds with similar useful activities but with better
pharmacokinetic profile and metabolism and fewer side-effects. The present invention helps
meet the need for such compounds as well.

Summary of the Invention

In one embodiment, the present invention provides recombinant DNA vectors that
encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention include
cosmid pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C3, pKOS065-M27, and
pKOS065-M21. The invention also provides nucleic acid compounds that encode the
various domains of the FK-520 PKS, i.e., the KS, AT, ACP, KR, DH, and ER domains.
These compounds can be readily used, alone or in combination with nucleic acids encoding
other FK-520 or non-FK-520 PKS domains, as intermediates in the construction of
recombinant vectors that encode all or part of PKS enzymes that make novel polyketides.

The invention also provides isolated nucleic acids that encode all or part of one or
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an acyl
transferase activity, and an acyl carrier protein activity. The invention provides an isolated
nucleic acid that encodes one or more open reading frames of FK-520 PKS genes, said open
reading frames comprising coding sequences for a CoA ligase activity, an NRPS activity, or
two or more extender modules. The invention also provides recombinant expression vectors
containing these nucleic acids.

In another embodiment, the invention provides isolated nucleic acids that encode all
or a part of a PKS that contains at least one module in which at least one of the domains in
the module is a domain from a non-FK-520 PKS and at least one domain is from the
FK-520 PKS. The non-FK-520 PKS domain or module originates from the rapamycin PKS, the
FK-506 PKS, DEBS, or another PKS. The invention also provides recombinant expression
vectors containing these nucleic acids.

In another embodiment, the invention provides a method of preparing a polyketide,
said method comprising transforming a host cell with a recombinant DNA vector that
encodes at least one module of a PKS, said module comprising at least one FK-520 PKS
domain, and culturing said host cell under conditions such that said PKS is produced and
catalyzes synthesis of said polyketide. In one aspect, the method is practiced with a
Streptomyces host cell. In another aspect, the polyketide produced is FK-520. In another
aspect, the polyketide produced is a polyketide related in structure to FK-520. In another
aspect, the polyketide produced is a polyketide related in structure to FK-506 or rapamycin.

In another embodiment, the invention provides a set of genes in recombinant form
sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These genes
and the methods of the invention enable one to create recombinant host cells with the ability
to produce polyketides or other compounds that require ethylmalonyl CoA for biosynthesis.
The invention also provides recombinant nucleic acids that encode AT domains specific for
ethylmalonyl CoA. Thus, the compounds of the invention can be used to produce
polyketides requiring ethylmalonyl CoA in host cells that otherwise are unable to produce
such polyketides.

In another embodiment, the invention provides a set of genes in recombinant form
sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA in a
heterologous host cell. These genes and the methods of the invention enable one to create
recombinant host cells with the ability to produce polyketides or other compounds that
require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides recombinant
nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA and
2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to produce
polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in host cells that
are otherwise unable to produce such polyketides.

In another embodiment, the invention provides a compound related in structure to
FK-520 or FK-506 that is useful in the treatment of a medical condition. These compounds
include compounds in which the C-13 methoxy group is replaced by a moiety selected from
the group consisting of hydrogen, methyl, and ethyl moieties. Such compounds are less
susceptible to the main in vivo pathway of degradation for FK-520 and FK-506 and related
compounds and thus exhibit an improved pharmacokinetic profile. The compounds of the
invention also include compounds in which the C-15 methoxy group is replaced by a moiety
selected from the group consisting of hydrogen, methyl, and ethyl moieties. The compounds
of the invention also include the above compounds further modified by chemical
methodology to produce derivatives such as, but not limited to, the C-18 hydroxyl
derivatives, which have potent neurotrophin but not immunosuppression activities.

Thus, the invention provides polyketides having the structure: 

wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided that 
when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen or 
hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen, 
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and 
18-hydroxy-FK-506. The invention provides these compounds in purified form and in 
pharmaceutical compositions. 
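
The size of the structural class defined above can be illustrated with a short enumeration. The following Python sketch is illustrative only and is not part of the disclosed methods. It lists every combination of the recited R1 through R5 options and assumes, as a simplification not stated in the text, that each of the four excluded natural products corresponds to exactly one combination. The variable names are hypothetical, and the proviso linking R2 to the C-19/C-20 double bond does not change the count.

```python
from itertools import product

# Substituent options exactly as recited in the structural definition.
R1 = ["hydrogen", "methyl", "ethyl", "allyl"]
R2 = ["hydrogen", "hydroxyl"]   # hydrogen implies a C-19/C-20 double bond
R3 = ["hydrogen", "hydroxyl"]
R4 = ["methoxyl", "hydrogen", "methyl", "ethyl"]
R5 = ["methoxyl", "hydrogen", "methyl", "ethyl"]

# Every (R1, R2, R3, R4, R5) assignment permitted by the definition.
combos = list(product(R1, R2, R3, R4, R5))
total = len(combos)

# Assumption (not from the source): FK-506, FK-520, 18-hydroxy-FK-520, and
# 18-hydroxy-FK-506 each map to exactly one of the combinations above.
N_EXCLUDED = 4
claimed = total - N_EXCLUDED

print(f"{total} combinations, {claimed} after the four exclusions")
```

Under that assumption, the definition spans a few hundred distinct substitution patterns, most of which differ from the natural products at the C-13 or C-15 position.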

In another embodiment, the invention provides a method for treating a medical 
condition by administering a pharmaceutically efficacious dose of a compound of the 
invention. The compounds of the invention may be administered to achieve 
immunosuppression or to stimulate nerve growth and regeneration. 

These and other embodiments and aspects of the invention will be more fully 
understood after consideration of the attached Drawings and their brief description below, 
together with the detailed description, examples, and claims that follow. 


Brief Description of the Drawings 
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line 
provides a scale in kilobase pairs (kb). The second line shows a restriction map with 
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is SacI; 
P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and related 
genes. Genes are abbreviated with a one-letter designation, i.e., C is fkbC. Immediately 
under the third line are numbered segments showing where the loading module (L) and ten 
different extender modules (numbered 1 - 10) are encoded on the various genes shown. At 
the bottom of the Figure, the DNA inserts of various cosmids of the invention (i.e., 34-124 
is cosmid pKOS034-124) are shown in alignment with the FK-520 biosynthetic gene 
cluster. 

Figure 2 shows the loading module (load), the ten extender modules, and the peptide 
synthetase domain of the FK-520 PKS, together with, on the top line, the genes that encode 
the various domains and modules. Also shown are the various intermediates in FK-520 
biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and 31 numbered. 
The various domains of each module and subdomains of the loading module are also shown. 
The darkened circles showing the DH domains in modules 2, 3, and 4 indicate that the 
dehydratase domain is not functional as a dehydratase; this domain may affect the 
stereochemistry at the corresponding position in the polyketide. The substituents on the 
FK-520 structure that result from the action of non-PKS enzymes are also indicated by arrows, 
together with the types of enzymes or the genes that code for the enzymes that mediate the 
action. Although the methyltransferase is shown acting at the C-13 and C-15 hydroxyl 
groups after release of the polyketide from the PKS, the methyltransferase may act on the 
2-hydroxymalonyl substrate prior to or contemporaneously with its incorporation during 
polyketide synthesis. 

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which 
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520 (Figure 
2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an 
ethylmalonyl-specific AT domain in extender module 4 of the PKS. At least four of the 
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The 
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA pools 
during FK-520 production. Polyhydroxybutyrate accumulates during vegetative growth and 
disappears during stationary phase in other Streptomyces (Ranade and Vining, 1993, Can. J. 
Microbiol. 39:377). Open reading frames with unknown function are indicated with a 
question mark. 

Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA 
from acetoacetyl CoA consistent with the function assigned to four of the genes in the 
FK-520 gene cluster shown in Figure 3. 

Figure 5 shows a close-up view of the right end of the FK-520 PKS gene cluster 
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM (a 
methyltransferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a 
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to be 
a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide 
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of 
ethylmalonyl CoA). 

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506) 
metabolism. 

Figure 7 shows a schematic process for the construction of recombinant PKS genes 
of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506 and 
FK-520 polyketides of the invention, as described in Example 4, below. 

Figure 8, in Parts A and B, shows certain compounds of the invention preferred for 
application in Part A and a synthetic route for making those compounds in Part B. 



Detailed Description of the Invention 

Given the valuable pharmaceutical properties of polyketides, there is a need for 
methods and reagents for producing large quantities of polyketides, as well as for producing 
related compounds not found in nature. The present invention provides such methods and 
reagents, with particular application to methods and reagents for producing the polyketides 
known as FK-520, also known as ascomycin or L-683,590 (see Holt et al., 1993, JACS 
115:9925), and FK-506, also known as tacrolimus. Tacrolimus is a macrolide 
immunosuppressant used to prevent or treat rejection of transplanted heart, kidney, liver, 
lung, pancreas, and small bowel allografts. The drug is also useful for the prevention and 
treatment of graft-versus-host disease in patients receiving bone marrow transplants, and for 
the treatment of severe, refractory uveitis. There have been additional reports of the 
unapproved use of tacrolimus for other conditions, including alopecia universalis, 
autoimmune chronic active hepatitis, inflammatory bowel disease, multiple sclerosis, 
primary biliary cirrhosis, and scleroderma. The invention provides methods and reagents for 
making novel polyketides related in structure to FK-520 and FK-506, and structurally 
related polyketides such as rapamycin. 

The FK-506 and rapamycin polyketides are potent immunosuppressants, with 
chemical structures shown below. 

FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having 
instead an ethyl group at that position, and has activity similar to that of FK-506, albeit with 
reduced immunosuppressive activity. 

These compounds act through initial formation of an intermediate complex with 
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including FKBP-12. 
Immunophilins are a class of cytosolic proteins that form complexes with molecules such as 
FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular targets 
involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to FKBP 
occurs through the structurally similar segments of the polyketide molecules, known as the 
"FKBP-binding domain" (as generally but not precisely indicated by the stippled regions in 
the structures above). The FK-506-FKBP complex then binds calcineurin, while the 
rapamycin-FKBP complex binds to a protein known as RAFT-1. Binding of the 
FKBP-polyketide complex to these second proteins occurs through the dissimilar regions of the 
drugs known as the "effector" domains. 




[Scheme: the FKBP-polyketide complex binds calcineurin or RAFT, leading to immunosuppression] 



The three-component FKBP-polyketide-effector complex is required for signal 
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and 
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin that 
destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of 
immunosuppressive activity, even though FKBP binding is unaffected. Further, such 
analogs antagonize the immunosuppressive effects of the parent polyketides, because they 
compete for FKBP. Such non-immunosuppressive analogs also show reduced toxicity (see 
Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760), indicating that much 
of the toxicity of these drugs is not linked to FKBP binding. 

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have 
neurotrophic activity. In the central nervous system and in peripheral nerves, immunophilins 
are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP is markedly 
enriched in the central nervous system and in peripheral nerves. Molecules that bind to the 
neuroimmunophilin FKBP, such as FK-506 and FK-520, have the remarkable effect of 
stimulating nerve growth. In vitro, they act as neurotrophins, i.e., they promote neurite 
outgrowth in NGF-treated PC12 cells and in sensory neuronal cultures, and in intact 
animals, they promote regrowth of damaged facial and sciatic nerves, and repair lesioned 
serotonin and dopamine neurons in the brain. See Gold et al., Jun. 1999, J. Pharm. Exp. 
Ther. 289(3): 1202-1210; Lyons et al., 1994, Proc. National Academy of Science 91: 
3191-3195; Gold et al., 1995, Journal of Neuroscience 15: 7509-7516; and Steiner et al., 1997, 
Proc. National Academy of Science 94: 2019-2024. Further, the restored central and 
peripheral neurons appear to be functional. 

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the small-molecule 
neurotrophins such as FK-506, FK-520, and rapamycin have different, and often 
advantageous, properties. First, whereas protein neurotrophins are difficult to deliver to 
their intended site of action and may require intra-cranial injection, the small-molecule 
neurotrophins display excellent bioavailability; they are active when administered 
subcutaneously and orally. Second, whereas protein neurotrophins show quite specific 
effects, the small-molecule neurotrophins show rather broad effects. Finally, whereas 
protein neurotrophins often show effects on normal sensory nerves, the small-molecule 
neurotrophins do not induce aberrant sprouting of normal neuronal processes and seem to 
affect damaged nerves specifically. Neuroimmunophilin ligands have potential therapeutic 
utility in a variety of disorders involving nerve degeneration (e.g., multiple sclerosis, 
Parkinson's disease, Alzheimer's disease, stroke, traumatic spinal cord and brain injury, 
peripheral neuropathies). 

Recent studies have shown that the immunosuppressive and neurite outgrowth 
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative activity 
in the absence of immunosuppressive activity is retained by agents which bind to FKBP but 
not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997, Nature Medicine 
3: 421-428. 

Available structure-activity data show that the important features for neurotrophic 
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments of 
the macrolide ring that bind to FKBP. This portion of the molecule is termed the "FKBP 
binding domain" (see VanDuyne et al., 1993, Journal of Molecular Biology 229: 105-124). 
Nevertheless, the effector domains of the parent macrolides contribute to conformational 
rigidity of the binding domain and thus indirectly contribute to FKBP binding. 




"FKBP binding domain" 

There are a number of other reported analogs of FK-506, FK-520, and rapamycin that bind 
to FKBP but not the effector protein calcineurin or RAFT. These analogs show effects on 
nerve regeneration without immunosuppressive effects. 

Naturally occurring FK-520 and FK-506 analogs include the antascomycins, which 
are FK-506-like macrolides that lack the functional groups of FK-506 that bind to 
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These molecules 
bind FKBP as effectively as does FK-506; they antagonize the effects of both FK-506 and 
rapamycin, yet lack immunosuppressive activity. 

Antascomycin A 

Other analogs can be produced by chemically modifying FK-506, FK-520, or 
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the effector 
binding region of FK-506, FK-520, or rapamycin by chemical modification. While the 
chemical modifications permitted on the parent compounds are quite limited, some useful 
chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 = 0.7 nM for 
FKBP binding; see Dumont et al., 1992), and the rapamycin analog WAY-124,466 (IC50 = 
12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical Research Communications 192: 
1340-1346) are about as effective as FK-506, FK-520, and rapamycin at promoting 
neurite outgrowth in sensory neurons (see Steiner et al., 1997). 




L-685,818 WAY-124,466 



One of the few positions of rapamycin that is readily amenable to chemical 
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by 
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of 
rapamycin with a variety of bulky groups has produced analogs showing selective loss of 
immunosuppressive activity while retaining FKBP-binding (see Luengo et al., 1995, 
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete 
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in binding 
to FKBP. 




There are also synthetic analogs of FKBP binding domains. These compounds 
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally designed" 
molecules that retain the FKBP-binding region in an appropriate conformation for binding 
to FKBP, but do not possess the effector binding regions. In one example, the ends of the 
FKBP binding domain were tethered by hydrocarbon chains (see Holt et al., 1993, Journal 
of the American Chemical Society 115: 9925-9938); the best analog, 2, below, binds to 
FKBP about as well as FK-506. In a similar approach, the ends of the FKBP binding 
domain were tethered by a tripeptide to give analog 3, below, which binds to FKBP about 
20-fold less tightly than FK-506. These compounds are anticipated to have neuroimmunophilin 
binding activity. 







2 3 



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand 
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is a 
neurotoxin, which, when administered to animals, selectively damages nigral-striatal 
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease. 
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand 
restored the ability of animals to feed themselves and gave improvements in measures of 
locomotor activity, neurological outcome, and fine motor control. There were also 
corresponding increases in regrowth of damaged nerve terminals. These results demonstrate 
the utility of FKBP ligands for treatment of diseases of the CNS. 

From the above description, two general approaches towards the design of 
non-immunosuppressant neuroimmunophilin ligands can be seen. The first involves the 
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain is 
fixed in a conformation optimal for binding to FKBP. The advantages of this approach are 
that the conformation of the analogs can be accurately modeled and predicted by 
computational methods, and the analogs closely resemble parent molecules that have proven 
pharmacological properties. A disadvantage is that the difficult chemistry limits the 
numbers and types of compounds that can be prepared. The second approach involves the 
trial and error construction of acyclic analogs of the FKBP binding domain by conventional 
medicinal chemistry. The advantage of this approach is that the chemistry is suitable for 
production of the numerous compounds needed for such interactive chemistry-bioassay 
approaches. The disadvantages are that the molecular types of compounds that have 
emerged have no known history of appropriate pharmacological properties, have rather 
labile ester functional groups, and are too conformationally mobile to allow accurate 
prediction of conformational properties. 

The present invention provides useful methods and reagents related to the first 
approach, but with significant advantages. The invention provides recombinant PKS genes 
that produce a wide variety of polyketides that cannot otherwise be readily synthesized by 
chemical methodology alone. Moreover, the present invention provides polyketides that 
have either or both of the desired immunosuppressive and neurotrophic activities, some of 
which are produced only by fermentation and others of which are produced by fermentation 
and chemical modification. Thus, in one aspect, the invention provides compounds that 
optimally bind to FKBP but do not bind to the effector proteins. The methods and reagents 
of the invention can be used to prepare numerous constrained cyclic analogs of FK-520 in 
which the FKBP binding domain is fixed in a conformation optimal for binding to FKBP. 
Such compounds will show neuroimmunophilin binding (neurotrophic) but not 
immunosuppressive effects. The invention also allows direct manipulation of FK-520 and 
related chemical structures via genetic engineering of the enzymes involved in the 
biosynthesis of FK-520 (as well as related compounds, such as FK-506 and rapamycin); 
similar chemical modifications are simply not possible because of the complexity of the 
structures. The invention can also be used to introduce "chemical handles" into normally 
inert positions that permit subsequent chemical modifications. 

Several general approaches to achieve the development of novel neuroimmunophilin 
ligands are facilitated by the methods and reagents of the present invention. One approach is 
to make "point mutations" of the functional groups of the parent FK-520 structure that bind 
to the effector molecules to eliminate their binding potential. These types of structural 
modifications are difficult to perform by chemical modification, but can be readily 
accomplished with the methods and reagents of the invention. 

A second, more extensive approach facilitated by the present invention is to utilize 
molecular modeling to predict optimal structures ab initio that bind to FKBP but not 
effector molecules. Using the available X-ray crystal structure of FK-520 (or FK-506) 
bound to FKBP, molecular modeling can be used to predict polyketides that should 
optimally bind to FKBP but not calcineurin. Various macrolide structures can be generated 
by linking the ends of the FKBP-binding domain with "all possible" polyketide chains of 
variable length and substitution patterns that can be prepared by genetic manipulation of the 
FK-520 or FK-506 PKS gene cluster in accordance with the methods of the invention. The 
ground state conformations of the virtual library can be determined, and compounds that 
possess binding domains most likely to bind well to FKBP can be prepared and tested. 

Once a compound is identified in accordance with the above approaches, the 
invention can be used to generate a focused library of analogs around the lead candidate, to 
"fine tune" the compound for optimal properties. Finally, the genetic engineering methods 
of the invention can be directed towards producing "chemical handles" that enable 
medicinal chemists to modify positions of the molecule previously inert to chemical 
modification. This opens the path to previously prohibited chemical optimization of lead 
compounds by time-proven approaches. 

Moreover, the present invention provides polyketide compounds and the 
recombinant genes for the PKS enzymes that produce the compounds that have significant 
advantages over FK-506 and FK-520 and their analogs. The metabolism and 
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to be 
similar in these respects. Absorption of tacrolimus is rapid, variable, and incomplete from 
the gastrointestinal tract (Harrison's Principles of Internal Medicine, 14th edition, 1998, 
McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form is 27% 
(range 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65 L per kg of 
body weight (L/kg), and is much higher than the VolD based on whole blood 
concentrations, the difference reflecting the binding of tacrolimus to red blood cells. Whole 
blood concentrations may be 12 to 67 times the plasma concentrations. Protein binding is 
high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The half-life for 
distribution is 0.9 hour; elimination is biphasic and variable, with a terminal half-life of 
11.3 hours (range, 3.5 to 40.5 hours). The time to peak concentration is 0.5 to 4 hours after 
oral administration. 
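
The figures quoted above can be made concrete with a simple calculation. The Python sketch below is illustrative only. It assumes single-compartment, first-order elimination governed by the terminal half-life, which simplifies the biphasic elimination described above, and it reads the 11.3-hour value as that terminal half-life. The 10 mg oral dose in the usage lines is hypothetical.

```python
# Values quoted in the text for tacrolimus.
TERMINAL_HALF_LIFE_H = 11.3   # hours (reported range 3.5 to 40.5)
ORAL_BIOAVAILABILITY = 0.27   # mean fraction absorbed (reported range 0.05 to 0.65)


def fraction_remaining(t_hours, half_life_hours=TERMINAL_HALF_LIFE_H):
    """Fraction of drug remaining after t_hours under first-order elimination.

    Assumption: a single-compartment model; the distribution phase is ignored.
    """
    return 0.5 ** (t_hours / half_life_hours)


def absorbed_dose_mg(oral_dose_mg, bioavailability=ORAL_BIOAVAILABILITY):
    """Amount of an oral dose that reaches the systemic circulation."""
    return oral_dose_mg * bioavailability


# Hypothetical 10 mg oral dose, followed over one day.
dose = absorbed_dose_mg(10.0)
for t in (0, 12, 24):
    print(f"t = {t:2d} h: {dose * fraction_remaining(t):.2f} mg remaining")

# The reported half-life range translates into very different exposure.
for t_half in (3.5, 11.3, 40.5):
    print(f"half-life {t_half} h: {fraction_remaining(24, t_half):.3f} left at 24 h")
```

The second loop shows why the wide reported range of half-lives makes dosing hard to predict from patient to patient.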

Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the liver 
and small intestine. The drug is extensively metabolized with less than 1% excreted 
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus, doses 
have to be reduced substantially in primary graft non-function, especially in children. In 
addition, drugs that induce the cytochrome P450 3A enzymes reduce tacrolimus levels, 
while drugs that inhibit these P450s increase tacrolimus levels. Tacrolimus bioavailability 
doubles with co-administration of ketoconazole, a drug that inhibits P450 3A. See Vincent 
et al., 1992, In vitro metabolism of FK-506 in rat, rabbit, and human liver microsomes: 
Identification of a major metabolite and of cytochrome P450 3A as the major enzymes 
responsible for its metabolism, Arch. Biochem. Biophys. 294: 454-460; Iwasaki et al., 1993, 
Isolation, identification, and biological activities of oxidative metabolites of FK-506, a 
potent immunosuppressive macrolide lactone, Drug Metabolism & Disposition 21: 971-977; 
Shiraga et al., 1994, Metabolism of FK-506, a potent immunosuppressive agent, by 
cytochrome P450 3A enzymes in rat, dog, and human liver microsomes, Biochem. 
Pharmacol. 47: 727-735; and Iwasaki et al., 1995, Further metabolism of FK-506 
(Tacrolimus); Identification and biological activities of the metabolites oxidized at multiple 
sites of FK-506, Drug Metabolism & Disposition 23: 28-34. The cytochrome P450 3A 
subfamily of isozymes has been implicated as important in this degradative process. 

Structures of the eight isolated metabolites formed by liver microsomes are shown in 
Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on carbons 13, 
15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy) compounds 
undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII, and the 
12-hydroxy metabolite at C-10 to give I. Another four metabolites, formed by oxidation of the 
four metabolites mentioned above, were isolated using liver microsomes from 
dexamethasone-treated rats. Three of these are metabolites doubly demethylated at the methoxy groups on 
carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15 (M-VII). The fourth, M-VIII, 
was the metabolite produced after demethylation of the 31-methoxy group, followed by 
formation of a fused ring system by further oxidation. Among the eight metabolites, M-II 
has immunosuppressive activity comparable to that of FK-506, whereas the other 
metabolites exhibit weak or negligible activities. Importantly, the major metabolite of 
human, dog, and rat liver microsomes is the 13-demethylated and cyclized FK-506 (M-I). 
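
The three doubly demethylated metabolites named above are exactly the possible pairs of the three methoxy-bearing carbons. The short Python sketch below is a bookkeeping illustration only. The metabolite labels are taken from the text; the variable names are hypothetical.

```python
from itertools import combinations

# Carbons of FK-506 that carry methoxy groups, as stated in the text.
METHOXY_CARBONS = (13, 15, 31)

# Doubly demethylated metabolites and the carbon pairs named in the text.
DOUBLE_DEMETHYLATED = {
    (15, 31): "M-V",
    (13, 31): "M-VI",
    (13, 15): "M-VII",
}

# Every way of choosing two of the three methoxy positions.
pairs = list(combinations(METHOXY_CARBONS, 2))

# Check that each possible pair corresponds to a reported metabolite.
all_named = all(pair in DOUBLE_DEMETHYLATED for pair in pairs)

for pair in pairs:
    print(pair, "->", DOUBLE_DEMETHYLATED[pair])
```

Two of the three pairs include carbon 13, which matches the statement that M-VI and M-VII, like M-I, arise from 13-demethylated compounds.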

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed by 
cyclization to the inactive M-I, this representing about 90% of the metabolic products after a 
10 minute incubation with liver microsomes. Analogs of tacrolimus that do not possess a 
C-13 methoxy group would not be susceptible to the first and most important 
biotransformation in the destructive metabolism of tacrolimus (i.e., cyclization of 
13-hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer half-life 
in the body than does FK-506. The C-13 methoxy group is believed not to be required for 
binding to FKBP or calcineurin. The C-13 methoxy is not present on the identical position 
of rapamycin, which binds to FKBP with affinity equal to that of tacrolimus. Also, analysis of 
the 3-dimensional structure of the FKBP-tacrolimus-calcineurin complex shows that the 
C-13 methoxy has no interaction with FKBP and only a minor interaction with calcineurin. 
The present invention provides C-13-desmethoxy analogs of FK-506 and FK-520, as well 
as the recombinant genes that encode the PKS enzymes that catalyze their synthesis and 
host cells that produce the compounds. 

These compounds exhibit, relative to their naturally occurring counterparts, 
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or 
reduced frequency of administration. Dosing is more predictable, because the variability in 
FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood can 
vary widely depending on interactions with drugs that induce or inhibit cytochrome P450 
3A (summarized in USP Drug Information for the Health Care Professional). Of particular 
importance are the numerous drugs that inhibit or compete for CYP 3A, because they 
increase FK-506 blood levels and lead to toxicity (Prograf package insert, Fujisawa US, 
Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A (e.g., 
dexamethasone), because they decrease FK-506 blood levels and reduce efficacy. Because 
the major site of CYP 3A action on FK-506 is removed in the analogs provided by the 
present invention, those analogs are not as susceptible to drug interactions as the naturally 
occurring compounds. 

Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant adverse 
effects resulting from the use of FK-506 and are believed to be similar for FK-520. Because 
these effects appear to occur primarily by the same mechanism as the immunosuppressive 
action (i.e., FKBP-calcineurin interaction), the intrinsic toxicity of the desmethoxy analogs 
may be similar to FK-506. However, toxicity of FK-506 is dose related and correlates with 
high blood levels of the drug (Prograf package insert, Fujisawa US, Rev 4/97, Rec 6/97). 
Because the levels of the compounds provided by the present invention should be more 
controllable, the incidence of toxicity should be significantly decreased with the 
13-desmethoxy analogs. Some reports show that certain FK-506 metabolites are more toxic 
than FK-506 itself, and this provides an additional reason to expect that a CYP 3A resistant 
analog can have lower toxicity and a higher therapeutic index. 

Thus, the present invention provides novel compounds related in structure to 
FK-506 and FK-520 but with improved properties. The invention also provides methods for 
making these compounds by fermentation of recombinant host cells, as well as the 
recombinant host cells, the recombinant vectors in those host cells, and the recombinant 
proteins encoded by those vectors. The present invention also provides other valuable 
materials useful in the construction of these recombinant vectors that have many other 
important applications as well. In particular, the present invention provides the FK-520 PKS 
genes, as well as certain genes involved in the biosynthesis of FK-520 in recombinant form. 

FK-520 is produced at relatively low levels in the naturally occurring cells, 
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus, 
another benefit provided by the recombinant FK-520 PKS and related genes of the present 
invention is the ability to produce FK-520 in greater quantities in the recombinant host cells 
provided by the invention. The invention also provides methods for making novel FK-520 
analogs, in addition to the desmethoxy analogs described above, and derivatives in 
recombinant host cells of any origin. 

The biosynthesis of FK-520 involves the action of several enzymes. The FK-520 
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products, 
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9 mediated 
by the P450 hydroxylase that is the fkbD gene product; the resulting hydroxyl group is 
oxidized by the fkbO gene product to form a keto group at C-9. There is also a methylation at C-31 
that is mediated by an O-methyltransferase that is the fkbM gene product. There are also 
methylations at the C-13 and C-15 positions by a methyltransferase believed to be encoded 
by the fkbG gene; this methyltransferase may act on the hydroxymalonyl CoA substrates 
prior to binding of the substrate to the AT domains of the PKS during polyketide synthesis. 
The present invention provides the genes encoding these enzymes in recombinant form. The 
invention also provides the genes encoding the enzymes involved in ethylmalonyl CoA and 
2-hydroxymalonyl CoA biosynthesis in recombinant form. Moreover, the invention 
provides Streptomyces hygroscopicus var. ascomyceticus recombinant host cells lacking 
one or more of these genes that are useful in the production of useful compounds. 

The cells are useful in production in a variety of ways. First, certain cells make a 
useful FK-520-related compound merely as a result of inactivation of one or more of the 
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in 
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a 
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable to 
make FK-520 or FK-520 related compounds due to an inactivation of one or more of the 
PKS genes. These cells are useful in the production of other polyketides produced by PKS 
enzymes that are encoded on recombinant expression vectors and introduced into the host 
cell. 

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or an 
FK-520 derivative compound is restored by introduction of a recombinant expression vector 
that contains the functional gene in a modified or unmodified form. The introduced gene 
produces a gene product that, together with the other endogenous and functional gene 
products, produces the desired compound. This methodology enables one to produce 
FK-520 derivative compounds without requiring that all of the genes for the PKS enzyme be 
present on one or more expression vectors. Additional applications and benefits of such 
cells and methodology will be readily apparent to those of skill in the art after consideration 
of how the recombinant genes were isolated and employed in the construction of the 
compounds of the invention. 

The FK-520 biosynthetic genes were isolated by the following procedure. Genomic
DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus (ATCC 14891)
using the lysozyme/proteinase K protocol described in Genetic Manipulation of
Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The average size of the DNA
was estimated to be between 80 and 120 kb by electrophoresis on 0.3% agarose gels. A library
was constructed in the SuperCos™ vector according to the manufacturer's instructions and
with the reagents provided in the commercially available kit (Stratagene). Briefly, 100 µg of
genomic DNA was partially digested with 4 units of Sau3AI for 20 min. in a reaction
volume of 1 mL, and the fragments were dephosphorylated and ligated to SuperCos vector
arms. The ligated DNA was packaged and used to infect log-stage XL1-Blue MR cells. A
library of about 10,000 independent cosmid clones was obtained.
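
As a rough check on library size, the standard Clarke-Carbon relation gives the probability that any given locus is present in a library of N independent clones. The sketch below is illustrative only: the 8 Mb genome size and the 40 kb average insert size are assumptions (the text states neither), and the function names are hypothetical.

```python
import math

def probability_locus_present(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon: P = 1 - (1 - f)^N, where f = insert size / genome size."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones

def clones_needed(p: float, insert_bp: float, genome_bp: float) -> int:
    """Number of independent clones needed to reach probability p."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))

# Assumed values, not taken from the text.
GENOME_BP = 8_000_000   # assumed Streptomyces-scale genome
INSERT_BP = 40_000      # assumed average cosmid insert

# The library size of about 10,000 clones is the figure given in the text.
p_library = probability_locus_present(10_000, INSERT_BP, GENOME_BP)
n_for_99 = clones_needed(0.99, INSERT_BP, GENOME_BP)
print(f"P(locus present in 10,000 clones) = {p_library:.6f}")
print(f"clones needed for P = 0.99: {n_for_99}")
```

Under these assumptions a library of this size is far larger than the minimum needed for 99% representation, which is consistent with overlapping cosmids being recovered by probing.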
Based on recently published sequence from the FK-506 cluster (Motamedi and
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These cosmids
(pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that overlap with
one another. Initial sequence data from these two cosmids generated sequences similar to
sequences from the FK-506 and rapamycin clusters, indicating that the inserts were from the
FK-520 PKS gene cluster. Two EcoRI fragments were subcloned from cosmids pKOS034-124
and pKOS034-120. These subclones were used to prepare shotgun libraries by partial
digestion with Sau3AI, gel purification of fragments between 1.5 kb and 3 kb in size, and
ligation into the pLitmus28 vector (New England Biolabs). These libraries were sequenced
using dye terminators on a Beckman CEQ2000 capillary electrophoresis sequencer,
according to the manufacturer's protocols.
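
The partial digestion and size selection used to build the shotgun libraries can be illustrated in silico. The sketch below is a simplified model, not the procedure actually used: it treats every pair of Sau3AI sites (recognition sequence GATC) as a possible fragment boundary, which is what a partial digest allows, and keeps fragments in a chosen size window. The function names and the toy sequence are hypothetical.

```python
import re

SAU3AI_SITE = "GATC"  # Sau3AI recognition sequence

def cut_positions(seq: str) -> list[int]:
    """0-based positions at which GATC sites begin."""
    return [m.start() for m in re.finditer(SAU3AI_SITE, seq.upper())]

def partial_digest_fragments(seq: str, min_len: int, max_len: int) -> list[tuple[int, int]]:
    """Fragments bounded by any two sites (partial digest), kept if min_len <= size <= max_len."""
    cuts = cut_positions(seq)
    fragments = []
    for i, left in enumerate(cuts):
        for right in cuts[i + 1:]:
            size = right - left
            if size > max_len:
                break  # sites are sorted, so later ones only give larger fragments
            if size >= min_len:
                fragments.append((left, right))
    return fragments

# Toy sequence with sites at 0, 20 and 50; a real run would use 1500 and 3000 as the bounds.
toy = "GATC" + "A" * 16 + "GATC" + "A" * 26 + "GATC"
toy_fragments = partial_digest_fragments(toy, min_len=20, max_len=30)
print(toy_fragments)
```

For the libraries described above, the size window would be `min_len=1500` and `max_len=3000`.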

To obtain cosmids containing sequence on the left and right sides of the sequenced
region described above, a new cosmid library of ATCC 14891 DNA was prepared
essentially as described above. This new library was screened with a new fkbM probe
isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the end of
cosmid pKOS034-124 was also used. Several additional cosmids to the right of the
previously sequenced region were identified. Cosmids pKOS065-C31 and pKOS065-C3
were identified and then mapped with restriction enzymes. Initial sequences from these
cosmids were consistent with the expected organization of the cluster in this region. More
extensive sequencing showed that both cosmids contained, in addition to the desired
sequences, other sequences not contiguous to the desired sequences on the host cell
chromosomal DNA. Probing of additional cosmid libraries identified two additional
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120,
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type Culture
Collection, Manassas, VA, USA. The complete nucleotide sequence of the coding
sequences of the genes that encode the proteins of the FK-520 PKS is shown below but
can also be determined from the cosmids of the invention deposited with the ATCC using
standard methodology.

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four open
reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame encodes
the loading module and the first four extender modules of the PKS. The fkbC open reading
frame encodes extender modules five and six of the PKS. The fkbA open reading frame
encodes extender modules seven, eight, nine, and ten of the PKS. The fkbP open reading
frame encodes the NRPS of the PKS. Each of these genes can be isolated from the cosmids
of the invention described above. The DNA sequences of these genes are provided below,
preceded by the following table identifying the start and stop codons of the open reading
frames of each gene and the modules and domains contained therein.
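
The assignment of modules to open reading frames described above can be captured in a small lookup structure. This is only an illustrative sketch; the dictionary and function names are hypothetical, and the contents restate what the paragraph says.

```python
# ORF -> modules it encodes, listed in the order given in the text.
PKS_ORFS = {
    "fkbB": ["loading", 1, 2, 3, 4],
    "fkbC": [5, 6],
    "fkbA": [7, 8, 9, 10],
    "fkbP": ["NRPS"],
}

def orf_for_module(module):
    """Return the ORF that encodes a given module (int extender number, 'loading' or 'NRPS')."""
    for orf, modules in PKS_ORFS.items():
        if module in modules:
            return orf
    raise KeyError(module)

def extender_modules_in_order():
    """Extender module numbers in the order the ORFs are listed."""
    return [m for modules in PKS_ORFS.values() for m in modules if isinstance(m, int)]

print(orf_for_module(6))
print(extender_modules_in_order())
```

A lookup of this kind makes it easy to see which single gene would have to be supplied on an expression vector to alter a particular module.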



Nucleotides                      Gene or Domain

complement (412 - 1836)          fkbW
complement (2020 - 3579)         fkbV
complement (3969 - 4496)         fkbR2
complement (4595 - 5488)         fkbR1
5601 - 6818                      fkbE
6808 - 8052                      fkbF
8156 - 8824                      fkbG
complement (9122 - 9883)         fkbH
complement (9894 - 10994)        fkbI
complement (10987 - 11247)       fkbJ
complement (11244 - 12092)       fkbK
complement (12113 - 13150)       fkbL
complement (13212 - 23988)       fkbC
complement (23992 - 46573)       fkbB
46754 - 47788                    fkbO
47785 - 52272                    fkbP
52275 - 71465                    fkbA
71462 - 72628                    fkbD
72625 - 73407                    fkbM
complement (73460 - 76202)       fkbN
complement (76336 - 77080)       fkbQ
complement (77076 - 77535)       fkbS
complement (44974 - 46573)       CoA ligase of loading domain
complement (43777 - 44629)       ER of loading domain
complement (43144 - 43660)       ACP of loading domain
complement (41842 - 43093)       KS of extender module 1 (KS1)
complement (40609 - 41842)       AT1
complement (39442 - 40609)       DH1
complement (38677 - 39307)       KR1
complement (38371 - 38581)       ACP1
complement (37145 - 38296)       KS2
complement (35749 - 37144)       AT2
complement (34606 - 35749)       DH2 (inactive)
complement (33823 - 34480)       KR2
complement (33505 - 33715)       ACP2
complement (32185 - 33439)       KS3
complement (31018 - 32185)       AT3
complement (29869 - 31018)       DH3 (inactive)
complement (29092 - 29740)       KR3
complement (28750 - 28960)       ACP3
complement (27430 - 28684)       KS4
complement (26146 - 27430)       AT4
complement (24997 - 26146)       DH4 (inactive)
complement (24163 - 24373)       ACP4
complement (22653 - 23892)       KS5
complement (21420 - 22653)       AT5
complement (20241 - 21420)       DH5
complement (19464 - 20097)       KR5
complement (19116 - 19326)       ACP5
complement (17820 - 19053)       KS6
complement (16587 - 17820)       AT6
complement (15438 - 16587)       DH6
complement (14517 - 15294)       ER6
complement (13761 - 14394)       KR6
complement (13452 - 13662)       ACP6
52362 - 53576                    KS7
53577 - 54716                    AT7
54717 - 55871                    DH7
56019 - 56819                    ER7
56943 - 57575                    KR7
57710 - 57920                    ACP7
57990 - 59243                    KS8
59244 - 60398                    AT8
60399 - 61412                    DH8 (inactive)
61548 - 62180                    KR8
62328 - 62537                    ACP8
62598 - 63854                    KS9
63855 - 65084                    AT9
65085 - 66254                    DH9
66399 - 67175                    ER9
67299 - 67931                    KR9
68094 - 68303                    ACP9
68397 - 69653                    KS10
69654 - 70985                    AT10
71064 - 71273                    ACP10
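
Entries in the table above are either plain ranges or ranges on the complementary strand. The following sketch shows one way such entries could be parsed and used to pull a feature out of a sequence; it is an illustration under stated assumptions rather than the method used to annotate the cluster. All names are hypothetical, coordinates are taken to be 1-based and inclusive, and the toy sequence is not from the listing.

```python
import re
from typing import NamedTuple

class Location(NamedTuple):
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    complement: bool  # True if the feature lies on the complementary strand

    @property
    def length(self) -> int:
        return self.end - self.start + 1

def parse_location(text: str) -> Location:
    """Parse 'complement (a - b)' or 'a - b', tolerating spaces inside numbers."""
    complement = text.strip().lower().startswith("complement")
    joined = re.sub(r"(?<=\d)\s+(?=\d)", "", text)  # rejoin digits split by stray spaces
    numbers = re.findall(r"\d+", joined)
    if len(numbers) != 2:
        raise ValueError(f"cannot parse location: {text!r}")
    return Location(int(numbers[0]), int(numbers[1]), complement)

def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def extract(seq: str, loc: Location) -> str:
    """Return the feature in its coding orientation."""
    sub = seq[loc.start - 1:loc.end]
    return reverse_complement(sub) if loc.complement else sub

def contains(outer: Location, inner: Location) -> bool:
    """True if inner lies within outer on the same strand."""
    return (outer.complement == inner.complement
            and outer.start <= inner.start and inner.end <= outer.end)

fkbA = parse_location("52275 - 71465")
ks7 = parse_location("52362 - 53576")
fkbL = parse_location("complement ( 1 2 1 1 3 - 1 3 1 50)")
print(contains(fkbA, ks7), ks7.length, fkbL)
print(extract("ATGCCGTA", Location(2, 4, True)))  # toy sequence, not from the listing
```

A check such as `contains(fkbA, ks7)` confirms that a listed domain falls inside the open reading frame said to encode it.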



1 GATCTCAGGC ATGAAGTCCT CCAGGCGAGG CGCCGAGGTG GTGAACACC? CGCCCCTCC7
61 7G7ACGGACC AC77CAC7CA CCGGCGA77G CGGAACCAAG TCATCCGGAA TAAAGGGCGJ
121 TTACAAGATC C7CACATTGC GCGACCGCGA GCATACGCTG AGTTGCCTCA GAGGCAAACC
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG GAGTACGCGA CGAGAGTGGC GCACZZSCCC
241 ACCGTCACC7 CTCTCCCCC3 CCGGCGGGAT GCCCG3CG7G ACACGGTTGG G— — r
301 ACGC7GAACA CCCGCGCGGT GTGGCGTCGG GGACACC3CC TGGCATCGGG CGGGTGACGG 
joI 7ACGGGGAGG GCG7ACGGCG GCCGTGGC7C G73C7CA.CGG CCGCCGGGGG 
A?\ GA3ACGGCAC 7CGGCGAG 7A GGGACGCC7G 3733GCACCT GCGGGCCGGA 
:> 481 G TrcCGGGC GGGCGCTGGC CGGTGGTGAG CCA. 3C7C7CC AGGGCGGTGA AGGC^GAGCG 

34 1 G.3ACACGGC AGCAAAGGCC GGAG7CGG7C GG3GAAGG7G TCGACGAGGG "t C GG"G— 
oCl C37GCCG7CC 7CGATGCGG7 AGTAGCGGTA CCGGCC GCCA GGCCGCTGCC GGACAT^"-*"-C 
bol GC37ACACG7 CGGAGCCC3G GCGGCAGGCA GCAGCACG7C G-GAGTG^^ — T - — -~~ 
721 CAGCGGC7TC CCGATACGAC CGGTCAACGC GA7GCG77CC AZGGCCGCG7 GGACGZZZzl 
1U 7yl GGAGCGGGTG GCGTAGTCG7 AGTCGGCATC GCAGCCCGGG ACCGTCCCCG GGGCGCW* 



34 1 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA GZCGGGGTCG AACTCCTCGC GGTAGACGC 




jTTGGGAC CCTCCGCGCC 

1081 Cw^CAGGGTG CCTTCCCAG7 CGACTCCTCC G7CG7ACAGC 7CGGGA7GGT 7C7CCAGC7G 
114 1 CCAGCGCACG AGG7AGCCGC CGTTGGACAT CCCGG7GACC AGGGTGCGCT CGAGCGGCCG 
1201 G7GG7AGCGC 7GGGCGACCG ACGCGCGGGC GGCCCGGGTC AGCTGGG7GA GGCGGGTG T T 
1261 CCAC7CGGCG ACGGCGTCGC CCGGCCGGGA GCCA7CACGG TAGAACGCGG GGCGGGTGTT 
1321 GCCCTTG7CG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG 
1381 G7CGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCZGGCC ACGACCAGGC CACCGTTCCA 
144 1 GCGGTCGGGC AGCCGGATGA CGAACTGGGC GTCG7GG7TC CACCCGTGGT T(3GTGTTGG T 
1501 GG7GGAGGTG TCGGGGAA.G7 AGCCGTCGAT C7GGA7CCCG GGCACTCCGG TGGGAGTGG^ 
1561 CAGGT7C7TG GGCGTCAGCC C7GCCCAG7C CGCCGGG7CG GTGTGGCCGG 7GGCCGCCGT 
1621 TCCCGCCGTG GTCAGC7CG7 CCAGGCAGTC GGCC7GC7GA CGTGCCGCCG CCGGGACACG 
25 1681 CAGCTGGGAC AGACGGGCGC AGTGACCG7C CGGGGCA7CG GGAGCAGGCC GGGCCG7GGC 

17 4 1 CGG7GAGGGG AGCAGGACGG CGAC7GCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCG7CT 
1801 TC7CGGGGCC CG7CCGACAC CGAGGGGCAG AACCA73GAG AGCC7CCAGA CGTGCGGATG 
-1861 GA7GACGGAC TGGAGGC7AG G7CGCGCACG GTGGAGACGA ACA7GGGTGC GCCCGCCATG 
1921 AC7GAGGCCC C7CAGAGG7G GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGC7CCGG 
30 1981 GGCGGTGCCC GCGGCCGCCA CCGGT7CCGG GTCCZZGGGT CAGGGACAGG TGTCG77CGC 

204 1 GACGGTGAAG TAGCCGG7CG GCGACTCTTT CAAGG7GGTC G7GACGAAGG TGTTGTACAG 
2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GG7G7AACCG GCGCTCGTCG TGGCGCGGCC 
2161 CGCCTGGACG 7GAGCGTAG7 7GCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG 
2221 CGCGG7GACC GCGCCCGAGA GCGGTCCGGC CTTGCCG7CC GCG7CCCGGG CGGCGACCGC 
35 2281 GTAGGTG7GC GATGTGCCCG CCCTCAGGCC GG7GTCCGTG TACGACGTCG TGGCGGACGT 

234 1 GG7GATC7GG GCACCG7CGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTT7CCA 
24 01 GG7CAGGC7G ATGGTGGTG7 CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG 
24 61 CGAACCGGGG TCGGAGGCGG A7CCGC7CAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA 
2 521 ACAGATCGAG TCCAGGAAG7 AGGCGGCGCC GG7GC7GCCG CACTGCTG7G C7CCGG7GCC 
40 2 581 GG3A7CGACC GGGGTGCCG7 GCCCGA7GCC CGGCACCCGG 7TCACCTCCA CGGCCACCGA 
264 1 TCCGTCCGCG GCCAGGTAC7 CC7CG7GCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG 
27 01 GTCCGGCGTC TGGGACACGC CG7GCACAGC GG7CCACTGG 7CGCGCAACT CG7CGGCG7T 
2 7 61 GCGCGGCGCG ACGGTGG7G7 CC7TGTCGCC G7GCCAGA7G GCCACGCGCG GCCACGGGCC 
2 821 CGACCACGAG GGG7AGCCG7 CACGGACCCG ZCGCZZZCAC 7GGTCCGCGG TCAGGTCGGT 
45 2881 CCZGGGGTTC ATGCACAGG7 ACGCGCTGC7 GACG7CGGTG GCACAGCCGA AGGGCAGGCC 

2 94 1 GGCGACGACC GCGCCGGCC7 GGAAGACGTC CGGA7AGGTG GCGAGCATCA CCGACG7CAT 
3001 GGCACCGCCG GCGGACAGCC CGGTGA7GTA GGTGCZZ7GG GGGTCCGCGC CGTAGGCGGA 

3 0 61 GACGGTGTGA GCGGCCATC7 GCCGGATCGA CGCGGCTTCG CCCTGGCCCC 7GCGGT7GTC 
3121 GC7GCTCTGG AACCAGT7GA AGCACC7GT7 CGCG77GT7C GACGACGTGG TCTCGGCGAA 

50 3181 CACGAGCAGG AAGCCA7AGC GG7CCGCGAA 7GAGAGCAGG CCGGAGTTGT CGGCGTAGCC 

32 41 C7GGGCGTCC TGGGTGCAAC CG7GCAGGGC GAACACCACC GCCGGC7CCG CGGGCAGGGA 
3301 CGCGGGCCGG TAGACG7ACA TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACZ7Z 
3 3 61 GG7CAGG7CC GCCT7GG7CA GACCGGGC7T GGCCAGGCCC GCCGCGGCG7 GGGCCG7CGG 
34 21 CGZZGGGCCG AGCAGGGCZG CTCCGAG7AC GAGGGCCACG ACGGCCACGA GACGGG7GAG 

55 3481 CACCCCCCGC CGTCCCGGAC GCGACAACGA ZZCGAZCGGC GGCGAGGAGG AGAGGGGGAA 

3 541 CA.GCGGGGTG AGGATTCCCC GGAACGGCGG CGGC7GCATG GCGGCTCCCT CGATGTCG7G 
3 601 GG33GGACAC GGAGGGC7CC CTGACGTCGA 7CAG7GGGAG CGCCCCGG7G CCCGGCACCG 
2c6i TAGGGG7GG7 TCAACCCGCA ACGGTATGGC ZZGGAGZAZC ACACCCCGCA CCGCGCGATG 
37 21 TGCGCCCGGA CGGA7TG7G7 CGCCTTGCGG AATC7GATAC CCGGACGCGA CGAACGCCCC 

3781 ACCCGACACG GGTAGGGCG7 CA7GGTG7CC GAC7CGGCCG G7CGGCC77C CCTGCCCTGG
3841
3901
3961
4021 GCGGCG AA. '
4081
4141
4201
4261
4321
4381
4441
4501
4561
4621
4681
4741
4801
4861
4921
4981
5041
5101

ATGTCGGTG;
TTGCCCCAGG
GTCAGGAGCG
TACACG7CGC

5161
5221
5281
5341
5401
5461
5521
5581
5641
5701
5761
5821
5881
5941
6001
6061
6121
6181
6241
6301
6361
6421
6481
6541
6601
6661
6721
6781
6841
6901
6961
7021
7081
7141
7201
7261
7321
7381



?r^: GG3 GGTCGGCGGA GCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG 
CCAGC^CGT GGGGGGGCCG CGCCCAAGTG CAGTACGCCG ACCGTGGCCG GCGGGAGGGC 
CGGACCGG7C AGTGCAGTCC CGCGGCCCTG CGGGACCGCT CG7CCCAGAC GGGTTCCAC~ 
^^~ G ~'' ZZ - G GG7CCGTG TCCGCGGCGG TAGACCA7CA G7G7CCGCTC GAAGG7CA~ : 
ACGAioACAG CGTCCTGGTT GTAGCCGATG GTGCGCACGC 7GATGATGCC 7ACG7CAGG7 
C(jGCTGGCGG ACTCCCGGGT GTTCAGGACC TCGGAC7GCG AG7AGATGGT GTCGCC^CG 
AAGACCGGG7 TCGGCAGCC7 GACCCGG7CC CAGCCGAGG7 TGGCCATCAC ATGCTGGGAG 
A ^"~ CGC7C7GCCC GG7GACCAGG GCGAGGG7GA AGGTGGAG7C CACCAGCGGC 

7GG7GCCCGC CGAGTAGTGG CGG7CGAAG7 GCAGCGGCGC GG7G7^C"-C 
7GAGCGAGGA GTTG7CGG7C TGCAGGACCG 7GCGGCCCAG GGGG7GGCGZ 
CGGTGGTGAA GTCCTCGAAG TAGCGGCCC7 GCCAGGCC7C GACCACAGCG 
G7GCGGG7GG CGTCC7GGTC CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTC r C 
CGGTCCGC7G TGAAATGCCG AACCTTCACC GGGCTCATAC G7GCGGCGCA TGAGCCC7GG 
ACCGTACG7A GTCGTAGAAC CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTG7GA 
CCACGCCGAC CGTGCGCCGC GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGG7CAC 
CGGGCCCGGA CGGGCTGCCG GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA 
GGGCCCGCAG CGTGCTCAGC TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG 
CGGCGCACAG CCGGTCGGTG ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCCACGAACG 
CCTCATCGGC CAGCTCCGCG GTCCGCACCC GGCGGCGTCT GGCCAGCCGG 7GTCCGGG7G 
GGACGAGCAG GCACAGTGCC TCGTCCCGCA GTGGTG7CCA CTCCACATCG TCCGCGGCGG 
GTCGTGGGC7 GGTCAGCCCC AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT 
CGGCGGCG7C GCCGCGCAGT TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG 7ACCCGGCGA 
GGAGGTCGGG CACCAGCCAG GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG 
TGTCGGGG7C GATCAGGGCG GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC 
GCAGGGCC7G GGCGCGGAAG ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC 
GGTCGAACAG CGGCACGCCC ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG 
GCTGGGAGA7 GTTGAGCCGT TCCGCGGTGA TCGTCACGTG C7CGTGCTCG GCCAAGGCCG 
TGAACCAC7G CAACTCCCGT ATCTCCATGC AGGGACTATA CG7ACCGGGC ATGGTCC7GG 
CGAGG77TCG TCATTTCACA GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG 
GACCCCA7GG GAGGGACCCC ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG 
CCGCGCCCCT GTCCGGTCTG CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG 
CCACCCGCCA CC7GGCGGAC CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGCAGCG 
GCGACCTCGC CCGCGGCTAC GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC 
TGAACCGGGG GAAGGAGAGC GTCCAGCTCG ATGTGCGCTC GCGGGAGGGC AACCGGCACC 
TGCACGCCTT GGTGGACCGG GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG 
GCCGCCTGGC ATCGGCCACC AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA 
CATATCCGGC TACGGCAGTA CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCC7GG 
TCCAGTGCGA AGCGGGGCTG GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGG7GG 
GCCTC7CCA7 CGCGGACATC TGTGCGGGGA TGTACGCGTA C7CCGGCA7C CTCACGGCCC 
TGCTGAAGCG GGCCCGCACC GGCCGGGGCT CGCAGTTGGA GG7CTCGATG CTCGAAGCCC 
TCGG7GAA7G GATGGGATAC GCCGAG.TACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC 
GCGCCGGCGC CAGCCACGCG ACGATCGCCC CCTACGGCCC G7TCACCACG CGCGACGGGC 
AGACGATCAA TCTCGGGCTC CAGAACGAGC GGGAGTGGGC T7CCTTCTGC GGTGTCGTGC 
TACAACGCCC CGGTCTCTGC GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC 
ACCGCACCGA GC7CGACGCC CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC 
TGGTGGCGCG GCTGGAGGAG GCGTCGATCG CCTACGCACG 
TCAGCGAACA CCCCCAACTG CGTGACCGTG GACGCTGGGC 
GTGCGCTCGA GGGCCTGATC CCCCCGGTCA CCTTCCACGG 
GCCGGG7CCC GGAGC7GGGC GAGCA7ACCG AGTCCGTCCT 

ACAGCGCCGA CCGCGAAGAG GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCC7G 
GCCGCCGTGT 7CCTGCTCGC CGGCGTACGG GGGCTGAACA 7GGGCCTGCT CCCGCTGGTC 
TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGG7 
GCA7G7TCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC 
G7CAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG 
GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TC7GCGCGAC AGGCGCGGCC 
TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCG7CG CGTTCGCCGT CAGGCACCGC 
7G7ACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CA.GT77CGCC 
TCC7GGGCGG CATCG7CCAC 7CGGCGC7GG AGAAGAACCA 7C7GCCCG7C 
TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTGTCA 
TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC 7GGACGAGGA CACCGA7CCC
7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACG-~7 GAC^GCGA^
7 501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCA" 
7 5 61 7TGGCGGCG7 TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGA^ 
_^Z l TGGTGCTGCT GGTA7GCGGG A7CG7GACC7 ACGTCGCCCT GCTCCAGGA 1 

- ;ocl -TGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCGC GC7G- — CG^ 

774 1 GCCCTGG7GA TCTGCTACGT GGGCGGTG7C GTCTCGGCC7 TCGCCTCGAC CACCGGGATr 
■•601 C7CGG7GCCC TGATGCCGCT GTCCGAGCCG TTGCTGAAGT • CGGGTGCGAT CGGGACGAC Z 
78 61 GGCA7G37GA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAG7CC CT^C7<~CA^~ 

7 921 AATGG7GC7C TGGTGGTGGC C AACGC7 CCC GAGCGGCTGC GGCCCGGCG7 GTACCAGGG^ 
1U 7 931 . 7GC7G7GG7 GGGGCGCCGG GGTG7GCGCA CTGGCTCCCG CGGGCGCZ7G GGCGCCCTT 1 

804 1 G7GG7GGCG7 GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGT^G 
8101 C7GACG7AGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCA7GG~ 
8161 7AATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTAC~ 
ID 8281 GCCGG7GCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG 

8 34 1 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT 
84 01 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCG£ 
8 4 61 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGCCCG 
8 521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT 

20 8 581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT 
8 64 1 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA 
87 01 AGCGGTGCAG GAGCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA 
87 61 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG 
8 821 G7GACCGGGG CGATGTCGGC GGCGG7CAGC GTCAGCGTCG TCGGCGCGGG CCTCGCGGAG 

25 8381 GGC7CCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG 

8 94 1 GGGCAGTCGG AGTCCGCGAA GCCCGCGAA.C CGGTAGGCGA TCTCCATCAT GCGGTTGCGG 
9001 7CCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCC7G GTCCGTGAGC 
90 61 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAG7 
9121 TTCAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCG7GCGGG 

30 9181 CCGAAGCGG7 CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC 

92 4 1 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA 
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA 

93 61 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA 

94 21 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA 
35 94 81. ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA 

954 1 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG 
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA 
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGC7CC CACGCGAGG7 

97 21 CGTGGTCG7T CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGG7GATCA 
40 97 81 CCTCGCGGA7 CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA 

98 4 1 AGGTGTTGTC CAGG7CCCAG ACCAGACACT TGACAATGGT CATGGCTGTC CTC7CAAGCC 
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCC7CGA7G 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCC7C 7C7CGCGCC7 

10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCG7TC GGCGGCGACG 

45 10081 7GCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGG7CGCTG 
10141 GCGTAC7CGC ACACGCGGGC CGCGATCTGC 7CCGCGG7GC ACAGG7CGGC GA7G7GCCGG 
10201 GCGACGAG77 GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCG7GGGCZ 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGA7C CCGACGCAGC CCCAGGCGAC CGAC77GCGC 
10321 CGGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 

50 10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG 
1044 1 CCGGACGGCT 7CGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC 
10 501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCG7CC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAG7 

55 10681 7TCCCGC7GG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCG7CGCC GAGCCGC7GC 
1074 1 ACGG7CCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG 
10801 CCGACGTGTG CGGTGAACTC GCCGT7CTCC CGGCTGCCGA GTCCCAGACC GCCG~GC7" 
- vj w b 1 ^»uCGCCAC77 CCoCGCAGAG CAGGCCG7CG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC 
10 921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGG7CGG7 CAGCAGCGCG 

60 10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCA7GG ACTCGACGGT 
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GCGACCAACG CGTGCG^m-. 
TGGGTC CGG7CA7GA3 AACACC77C7 CGTATTCG7A 3AAG — C^" 
s-CGCTCTTCC GGCCGTGGTG 7CCC7CGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG 

CT jCGCTCGT cgggggtggg tttgtgcagg acccacagcg cgtcgacgag gt~ct-gatg 

CCGATCAGGT CCGCGGTGCG CAGCGGCC73 GTCGGATGGG CGAGGCACCC CCT^ATGAGG 
GCGTGGACGT CCTCGACGGA CGCGG7GCCG tcctgcagga tccgcgccgc gtggttgatc 




TACCGATCGC GGGGAGCGCG 
-.S8i GAGGTGGCCG TCCGCAGCAC ACCGGGGTCG GCCTCGGCGG GCCCGGCCAC GAGT7G7GCC 
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG 
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT 
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG 
12121 GCAGCGAGTA CGGGTCGAGG ACGTCTTCGG GGGTCGACCC GATCGCGTCC TTGCGGCCGA 
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC 
1224 1 TGCCCGTCGA GTCGAGGACG C7CAGGC7G7 CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG 
12 301 CGCACAGGGC CGCCAGCGAC GGGCCGAGC7 CGCGG7CCGG CAG77GC7GG 7AC7CGCCC7 
12 361 GGGCGCGGGC CTGCCCCGGA 7GG7CGACGC AGA7GAACGC G7CG7CGAGC AGGG7C7TCG 
124 21 GCAG7-TCGGT C77GCCCGGC TCG7CGGZGC CGATGGCG77 CACATGCAGG TGCGGGAGGC 
124 31 GCGGC7CGGC GGGCAGCACC GGCCC777GC CCGAGGGCAC CGAGG7GACG G7GGACAGGA 
1254 1 CA7CCGCGGC GGZGGCGGCZ TCCGCC3GAT CGG7CACC77 GACCGGCAG7 CCGAGGAACG 
12 601 CGA7GCGG7C CGCGAACGAC GGCGCG7GGC CGGGGTCGGT G7CGC7GACC AGGA7CCGCT 
12 661 CGATGGGCAG GACCC7GC7G AGCGCG7GCG CC7GGG7CAC CGCC7G7GCG CGCGCGCCGA 
127 21 TCAGCG7GAG CG7GGCGC7G TCGGACCGGG CGAGCAGCCG GC7CGCGACG GCGGCGACCG 
12781 CGCCGG7CCG CA7CGCGG7G ATCAC3CC7G CG7CGGCGAG GGCGG7CAGA C7GCCGC7G7 
1284 1 CG7CG7CGAG GCGCGACA7C G7GCCGACGA TGG7CGGCAG CCGGAAGCGC GGA7AG77GT 
12 901 GCGGAC7G7A CGAAACCG7C 77CA7GG7CA CGCCGACACC GGGGACCCGG 7ACGGCA7GA 

12 961 AC7CGATGAC GCCGGGAA7G 7CGCCGCCGC GGACGAATCC GG7ACGCGGC GGCGCC7CGG 
13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CG7CGTGCAG CTCGC7GA7C AGCCGGTCCA 
13081 7CA7CACG7C GCGGCCGATC ACGGAGAGAA 7CCGC7TGAT GTCACG77GG CGCAGGACCC 
1314 1 7GG7C7GCAT GTGTCACC7C CC777CG7GG CCGGAGC7G7 CTTGG7GG7G CCGGTGGGGG 
13201 CGGC77CCG7 7C7CA7CGCA GC7CCC7G7C GA7GAGG7CG AAAA7CTCG7 CCGCGG7CGC 
13261 G7CCGCGGAC AGCACGCCGG CCGGCG7GG7 CGGGCGGG7C 7CCCGCCGCC AGCGG77GAG 

13 321 CAGGGCGTCC AGCCGGG77C CGATCGCG7C CGCC7GGCGG GCGCCCGGG7 CGACACCGGC 
12381 AACGAG7GCT TCCAGCCGGT CGAGC7GCGC GAGCACCACG GTCACCGGG7 CSTCCGGGGA 
13441 CAGCAG77CA CCGATGCGG7 CGGCGAG7GC GCGCGGCGAC GGG7AGTCGA AGACGAGCG7 
13 501 GGCGGACAGT CGCAGACCGG TCGCC7CGT7 GAGGCCG7TG CGCAGCTGCA CCGCGA7GAG 
13561 CGAGTCCACA CCGAG77CCC GGAACGCCGC G7CC7CCGGG ATG7CCTCCG GG7CGGCGTC 
13 621 CCCCAGGACG GCCGC7GCC7 TC7GCCGGAC GAGGGCGAGC AGG7CGG7GG GGCGT7CC7G 
13 681 C7CG77GCGG GCGC7CCGGC GGGCZGACGG C77GGGCCGG CCACGCAGCA GCGGGAGG7C 
13741 CGGCGGCAGG 7CGCCCGCCA CGGCGACGAC AC7GCCCG7T CCGG7G7GGA CGGGGGCG7C 
13801 G7ACA7GCGC ATGCCCTGT7 GGGCGG7GAG CGCGC7CGGG CCACCC77GC GCA7ACGGCG 
13861 CCGGTCGGCG 7CGG7CAGGT CCGCGGTCAG GCGAC7CGCC 7GG7CCCACA GGCCCCAGGC 

13 921 GA7CGACAGC CC7GGCAGCC C7TG7GCA2G CCGG7G77CG GCGAGCGCG7 CGAGGAACGC 
13981 G77CGCCGCC GCGTAGTTGC CC7GACCGGG GG7GCCCAGC ACACCGGCCG CCGACGAGTA • 
1404 1 GACGACGAA7 GCGGCGAGGT CGGTG7CGC3 GG7GAGCCGG TGCAGGTGGC AGGCGGGGTC 

14 101 GGCCTTGGGT T7GAGGACGG TGTCGA7GCG G7CGGGGG7G AGGT7GTCGA GCAGGGCGTC 
14 161 G7CGAGGG77 CCGGCGGTG7 GGAAGACGGC GG7GAGGGG7 7GAGGGA7G7 GGGCGAGGGT 
14 221 GG7GGCGAG7 7GG7GGGGG7 CGCCGACG7C GCAGGGGAGG 7GGG7GCCGG GGG7GG7G7C 
14 281 GGGGGGTGGG G7GCGGGAGA GGAGG7AGG7 G7GGGGG7GG T7CAGG7GGC GGGCGAGGA7 
14 341 GGCGGCGAGG G7GCCGGAGC CGCCGG7GA7 GACGACGGCC CCC7CGGGGT CCAGCGGCCG 
14 401 CGGGACCG7G AGGACGATC7 TGCCGG7G7G CTCGCCGCGG C7CA7GG7CG ZZAGZGZZ7C 
14 4 61 GCGGACCTGC CGCA7G7CG7 GCACCG7CAC CGGCAGCGGG 7GCAGCACAC CGCGCGCGAA 
14 521 CAGGCCGAGC AGC7CCGCGA 7GA7C7CC77 GAGCCGG7CG GGCCCCGCG7 CCA7CAGG7C 
14581 GAACGG7CGC TGGACGGCG7 GCCGGA7G7G CG7C77CCCC A7CTCGA7GA ACCGGCCACC
14941 GG7CATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC GGCATCCGGC C^AGC=l
: - 001 G7GGTCGGCC ATGACGGTGG GGCCGAAGCC GGTGCCGACG AGGGCGAAGA CGCGG7CGCC 
1:061 CGG7GCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG ATGCCCGCGG CCTCGCCG^r 
15121 G ^^ CA CGCCG TGACGGGGGT AGGTGCCGAG CGCGATGAGC ACATCGCGGA AGTTGAGCCC 

10 15181 CG7CGCACGC ACACCGATGG GGACCTCGGC ZGGGGCZAGG GGGCGCCGGG GCTCCGC~GA 
1524 1 G77GGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC GGCCGGATCA GCCACG^GTr 

15 301 GC7GTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG CGGGCCGCCT ZZAACZZZr" 
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGAGGGCG ATGCGCTGCT ZCTCGGGGGr 
15421 GAGCGTGAGG CGGGACTCGG TCTCGACGTG GACGAACCGG CCGGGCTGCT CGGCC7GGGC 

15 15481 GZZZCGCAGC AGTCGGGCCG CCGCGCCGGT GGCGAGGCCC GCGGTGGTGT GCACGAGCAG 
15541 A7CCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG GTGAGCGCAC GGGTCTCGGC 
15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG TCCACGTCGG TCGCGGGGAC 
15661 A7CCGTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC AGGCCGGTGC CGACGGGTGG 
15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG AGTTGGCCGG CGGAGTCGGC 
20 15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG GCTCGGAGCA TGGCCGAGCC 
15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA CCCGCAGCGC TGTCGTCCGG 
15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC GCCGGATGCA CACCGAAACC 
15961 G7CCGCC7CG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG GCATACACGG 7GTCACCA7C 
16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC TCATAACCGG CATCCCGCAG 
25 16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG ACCGGCGGCC ACTGCGAGAA 
16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG GGGGTCAGGG TGCCGCTGGC 
16201 G7GCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG ACGGTCACCG GCCGCCGTCC 
16261 GGCCTCA7CA GCCCCTTCCA CGGTCACCGA CACATCCACC GCTGCGGTCA CCGGCACCAC 
16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA CCGGTCTCG7 CACjCGGCCCG 
30 16381 GA7GACCAGC 7CCACAAACG CCG7ACCCGG CAGCAGGACC G7GCCCCGCA CCGCGTGA7C 
1 644 1 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG AGAACAACAC CACCA7CG7C 
16501 GGCGGGCAGC GCTG7GACAG CGGCCAGCA7 CGGATGCGCC GCACCCGTCA ACCCCGCCGC 
16561 CGACAGA7CG GTGGCACCGG CCGCCTCCAG CCAG7ACCGC C7G7GCTCGA ACGCG7ACG7 
. ^ 16621 GGGCAGATCC AGCAGCCG7C CCGGCACCGG TTCGACCACC GTGTCCCAGT CCACTGCCG7 

35 16681 GCCCAGGG7C CACGCCTGCG CCAACGCCG7 CAGCCACCGC TCCCAGCCGC CG7CACCGG7 
16741 CCGCAACGAC GCCACCG7GT GAGCCTGCTC CATCGCCGGC AGCAGCACCG GA7GGGCACT 
16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC GCG7CCAACG CCACCGGACG 
16861 ACGCAGA7TC CGGTACCAGT ACCCCTCATC CACCGGCTCC GTCACCCAGG CGCTG7CCAC 
16921 GG7CGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC CCC7CCAGTA CC77GGCCAG 
40 16981 T77ATCCTCG ATGGCTTCCA CGTGGGGCG7 GTGGGAGGCG TAGTCGACCG CGATACGACG 
1704 1 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC 7CCACCGCCG ACGGG7CCCC 
17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC CACACACCC7 CGACCAGACC 
17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC CGCCCGGCCA G7CGCGCCGC 
17 221 GATGACCTGA C7GCGCAATG CCACCACGCG GGCGGCGTCC TCGAGGCTGA GGGCTCCGGC 
45 17 281 CACGCACGGC GCCGCGATCT CGCCCTGGGA G7GTCCGATC ACCGCGTCCG GGACGACCCC 
17 341 A7GCGCC7GC CACAGCGCGG CCAGGC7CAC CGCGACCGCC CAGCTGGCCG GG7GGACCAC 
17401 C7CCAGCCGC 7CCGCCACAT CCGGCCGCCC CAACATCTCC CGCACATCCC AGCCCG7G7G 
174 61 CCGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG AACACCGCGG AGTGGGCCAT 
17 521 GAG7TCGACG CCCA7GCCGA CCCACTGGGC GCCCTGGCCG GGGAAGACGA ACACCG7ACG 
50 1~561 ZZGCTGGTCZ ACCGCCACAC CCGTCACCCG GGCATCGCCC AGCAGCACCG CACGG7GACC 
17 641 GAAGACAGCA CGCTCCCGCA CCAACCCC7G CGCGACCGCC GCCACA7CCA CACCACCCCC 
'.7701 G 7 GC AG A 7 AC CCC7CCAGCC GC7CCACCTG CCCCCGCAGA C7CACCTCAC CACGAGCCGA 
l"t-l 7A7CGGCAAC GGCACCAACC CG7CAACAAC CGAC7CCCCA CGZGACGZZZ CAGGAACACC 
17S21 C77AAGGA7C ACG7GCGCG7 TCG7ACCGC7 CACCCCGAAC GACGACACAC CCGCA7GCCG 
55. \ n 32l 7GCCCGA7CC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC AGC7CCACCG CACCCGCCGA 
17 941 CCAGTCCACA 7GCGACGACG GCTCGTCCAC ATGCAGCGTC TTCGGCGCGA TCCCG7ACCG 
180C1 CATCGCCATG ACCA7C77GA TCACACCGGC GACACCZZCC GCZGZCTGZZ C AT G AC 7 GAT 
13061 G77CGACT7C AACGAACCCA GCAGCAGCGG AACCTCACGC TCC7GCCCG7 ACG7CGCCAG 
18121 AA7GGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC GTCCCGTGCG /CC7CCACCAC 
18181 G7GCACATCG GGGGCGCGCA GTCCGGCGTT CACCAACGCC TGCTGGATGA CACGC7GCTG
18241 GGACGGZZZG TTGGGGGCGG ACAGCCCGTT GGAGGCACZG TCCTGGTTCA CCGC^GAC^C
i3301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG TCGGAGAGCC GC7CCAGCAC 
13361 AAGAACGCCG GCGCCCTCCG CCCAGCCGGT GCCGT7GGCG GCGTCCGCGA ACG~G<~GGCP 
.84 21 GZGGZZZ. ZZ GGGGAGAGTC CGCCCTGC7G CTGGAATTCC ACGAACCCGG 7CGGGG7C-" 
' CATGACGGTG ACACCGCCGA CCAGCGCGAG CGAGCACTCC CCGTGGCGCA GTGCGTGCCC 

18 54 1 GGCCTGG.ZC AGCGCGACCA GCGACGACGA GCACGCCGTG TCCACCGTGA ACGCZGG^r- 
13 601 C7GGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACg' CTGGGCTGCA TGCCGAT^GP 
-c66i GCCGAAZZZZ TCCAGGTCCG CGCCGACGCC GTACCCGTAC GAGAAGGCGC CCA7GAACAC 
1 3721 GCCGG7 - . CG C7GCCGCGCA G7G7GCCCGG CACGA7GCCC GCGC7C7CGA ACGCCT-^CGA 
10 .=781 7G7CG777C7 AGGAGGA7CC GCTGC7GGGG G7CCATGGCC CG7GCCTCAC GGGGGC7GA7 
18841 GCCGAAGAAC GCGGGA7CGA AGCCGGCGGC G7CGGAGAGG AAGCCGCCGC GG7CCG7G7C 
13 901 CGATCCGZZG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG GCCGGGAAGC GGG7GACCGC 
13 961 GTCGCCGGGA C7GTCCACCA 7GCGCCACAG GTCG7CGGGC GAGG7GACGC CGCCCGGCAG 
19021 7CGGCAGGGC A7GCCCACGA TGGZCAGGGG 7TCG7GACGG GTCGCGGCGG CTGTGGGKAC 
15 19081 AGCGACCGG7 GCGGGAGCAC CGACCAGAGC C7CG7CCAAC CGCGAGGGGA 7GGCCCGCGG 
19141 CG7CGGG7AG 7CGAAGACAA GCGTGGCGGG CAG7CGGACA CCGGTCGCCG CGGCGAC7CG 
19201 G77CCGCAG7 TGGACGGCGG 7CAGCGAGTC GATACCCAG7 7CCTTGAAGG CCGCGTCCGC 
19261 GGACACG7GC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC GCG7TGTCGC GGACCAG7GC 
19321 CAGCAGCGGG G7G7CCCGC7 CAGCGCCGGA CATGGTGCCG AGCCGGTCGG CGAGCGGAAC 
20 19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA TCGGCGAAAA GCGGCGA7G7 
19441 GTGGGCGG7G AGGTCCATCG 7GGCCGCCAC GGCGAACGCG G7GCCGGTTC CGGCCGCGGC 
19501 77CCAGCAGG CGCA7GCCCA CACCGGCCGA CA7GGGGCGG AAACCGCCGG GGCGGACACG 
19561 GG7GCGG77G GTGCCGCTCA TGCTGCCGGT GAG7CCGC7G TCA7CGGCCC AGAGGCCCCA 
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC GTCGCGAGTC CG7CGAGGAA 
25 19681 CCCGTTCGCC GCCGAGTAG7 TGCCCTGGCC GCGGCCGCCC A7GATGCCCG CGACGGACGA 
1974 1 GTAGAGGACG AACGAGCGCA GGTCCGCG7C CCGGGTCAGC 7CGTGCAGG7 GCCAGGCGCC 
19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG GTGAG7GCCG 7GG7CACGCC 
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC GGCC7GCCGG CGGCGGCGAG 
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC G7CACAGCGG ATGTGGACAC CGGGAG7G7C 
30 19981 GGCCGGCGGT 7CGC7GCGCG ACAGCAACAG GAGG7GGCGG GCGCCATGC7 CGGCGACGAG 
2004 1 A7GCCGGGCG AGGAGACC7G CCAGCACACC CGAGCCGCCG G7GA7GACCA CCG7GCCG7C 
20101 CGGG7CGAGC AGCGGT7CGG GCGTTTCCGC GGCGGCCGTG CGGG7GAACC GCGGCGCT7C 
20161 G7ACCGGCCG TCGGTGACGC GGACG7ACGG C7CGGCCAGT G7CG7GGCGG CGGCCAGCGC 
20221 CTCGA7GGGG G7G7CGG7GC CGG7C7CCAC CAGCACGAAC CGGCCCGGGT GC7CGGCC7G 
35 2C281 GGCGGACC3G ACGAGGCCGG CGACCGC7CC TCCGACCGG7 CCCGCGTCGA TCCGGACGAC 
20 34 1 GAGGGTGG7C 7CCGCAGGGC CGTCCTCGGC GA7CACCCGG 7GCAGC7CGC CGAGCACGAA 
204 01 C7CGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC GG7TCCGGGA GCGCGGAGAC 
204 61 GA7G7GGACC GCGTCCGCAG GACCGGGCCC GGGAG7GGGC AGCTCGGTCC AGGAGAGGCC 
20 521 G7ACAAGGAG TTCCGTACGA CGGCGGCG7C GCCCTCGACG TTCACCGG7C GCGCGG7CAG 
40 2 0 581 GGCGGCGACG GTCACCACCG G7TGGCCGAC CGGGTCCGTC GCATGCACGG CAGCGCCGTC 
20641 CGGGCCCTGA GTGATCG7GA CGCGCAGCGT GG7GGCCCCG G7CG7G7GGA ACCGCACGCC 
207 01 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC GCGAGCAGCG GCAGGCAGG7 
207 61 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG G7GGACGCCA 7AGTGCGGCG 7CTCG7CCGC 
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG CCGTCGCGCC AGGCGGTGCG 
45 2 0881 CAGTCCCTGG AACGC7GGGC CGTAGCTGTA GCGGGTCTCG GCCAGCCGCT CG7AGAACGC 
2094 1 GCTCACG7CG ACGCG7CGCG CGCCCGGCGG CGGCCACGCG GGCGGCGGGA CCGCCGGGAC 
21001 GCTTCCGGZC CGGCCGAGGG 7GCCGC7GGC GTGCCGGG7C CAGC7G7CCG 7GCCC7CGG7 
210 61 ACGCGCG7GG ACGG7CAC7C GCCGCGG7CC GGCC7CA7CG GCCCC77CGA CGG7CACCGA 
21121 CACATCCACC GCGCCGG7CA CCGGCACCAC GAGCGGGGTC 7CGA7GACCA G77CA7CCAC 
50 21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GA7GACCAGC TCCACAAACG CCG7ACCCGG 
21241 CAGCAGAACC GTGCCCCGCA CCGCGTGA7C AGCCAGCCAG GGATGCGTAC GCAACGAGAT 
21301 CCGGCCAG7G AGAACAACAC CACCACCGTC GTCGGCGGGC AG7GC7G7GA CGGCGGCGAG 
2 1361 CATCGGA7GC GCCGCCCCCG 7CAGCCCGGC CGCGGACAGA TCGG7GGCAC CGGCCCCCTC 
214 21 C AG CC AG 7 AC CGCC7G7GCT CGAACGCG7A GG7GGGCAGA 7CGAGCAGCC GTCCCGGCAC 
55 214 81 CGC77CGACC ACCG7GTCCC AG7CCACTGC CG7GCCCAGG GTCCACGCC7 GCGCCAACGC 
21541 CGTCAGCCAC CGCTCCCAGC CGCCG7CACC GG7CCGCAAC GACGCCACCG 7G7GAGCC7G 
21601 77CCA7CC2C G G C AG C AG C A CCGGATGGGC GC7GCAC7CC ACGAACACGG ACCCC7CCAG 
- 1 6 c^l C7CCGCCACC GCCGCG7CCA GCGCGACGGG GCGACGCAGG 77CCGG7ACC AG7AGCCC7C 
217 21 A7CCACCGGC 7CGG7CACCC AGGCGCTGTC CACCG7GGAC CACCAGGCCA CCGACCCGG7 
.60 21781 CCCGCCGGAA A7CCCCTCCA G7ACC7CGGC CAACTCG7CC 7CGA7GGC77 CCACGTGGGG 
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC ACGCCTTCGG CCTCGTAC~
^1901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC ACAGTCGAAG ACGGGCCGT* 
219bl ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA CCGGCCGGCA ACGCCACCGA 
22C22 AGCCATCGCC CCCCGCCCGG CCAGCCGZZC GGCGA7CACC TGGCTGCGCA AGGCZZC~" 
.'2031 ^CGGGGGGCG TCCTCAAGGC TGAGGGC7CC GGCCACACAC GCCGCCGCGA TCTCGC^CT" 




2^2 61 CGCCAACATC TCCCGCACAT CCCAGCCCG7 G7GCGGCAAC AACGCCCGCG CACACTCC1 ■ 
22321 CATACGAGCC GCGAACACCG CAGAACACGC CA7CAAC7CC ACACCCA7GC CCACCCACTG 
10 2 2 331 AGCACCC7GC CCGGGAAAGA CGAACACCGT ACGCGGC7GA 7CCACCGCCA CACCCA7CA~ 
224 4 1 CCGGGCA7CG CCGAACAACA CCGCACGG7G ACCGAAGACA GCACGC7CAC GCACCAACCC 
22 501 C7GCGCGACC GCGGCCACA7 CCACAGCACC CCCGCGCAGA 7ACCCC7CCA GCCGCTCCAC 
22 561 CTGCCCCCGC AGAC7CACC7 CAC7CCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC 
22621 AGCCGAC7CC CCACGCGACG GCCCGGGAAC ACCCTCAAGG A7CACG7GCG CGTTCGTAC^ 
15 22 681 GC7CACCCCG AAAGCGGAGA GACCGGCZZG GCGCGGACG7 CCCGCG7CGG GCCACGCCCC 
2274 1 CGCC7CGG7G AGCAG77CCA CCGCGCCC7G GG7CCAG7CC ACATGCGACG ACGGC7CG7C 
22 801 CACATGCAGC G7C7TCGGCG CGA7GCCA7A CCGCATCGCC A7GACCA7C7 7GA7GACACC 
228 61 GGCGACACCC GCAGCCGCC7 GCGCA7GACC GATG7TCGAC 7TCAACGAAC CCAGCAGCAG 
22 921 CGGAACC7CA CGC7CCTGCC CG7ACG7CGC CAGAA7CGCG 7GCGCCTCGA 7GGGA7CGCC 
20 22 981 CAGCG7CG7C CCCGTCCCGT GCGCC7CCAC CACGTCCACG 7CGGCGGGGG CGAGCCCCGC 
2 3041 C7TGTGGAGG GCCTGGCGGA 7GACGCGC7G CTGGGAGGGG CCG77GGG7G CGGAGA7GCC 
2 3101 G7TGGAGGCG CCG7CC7GG7 7GACGGCGGA GGAGCGGACG ACCGCGAGGA CGG7GTG7CC 
2 3161 G77GCGC7CG GCG7CGGAGA GC7777CGAC GACGAGGACG CCGGCCCCC7 CGGCGAAACC 
2 3221 GGTGCCG7CC GCCGCG7CAG CGAACGCC77 GCACCGTCCG 7CCGGCGCGA CGCCGCCC7G 
25 23281 CCGGGAGAAC 7CCACGAAGG TC7G7GG7GA TGCCA7CAC7 G7GACACCAC CGACCAGCGC 
23341 CAGCGAGCAC 7CCCCGG7CC GCAGCGCC7G CCCGGCC7GG 7GCAGCGCGA CCA.GCGACGA 
2 3401 CGAACACGCC G7G7CGACCG TGACCGCCGG ACCCTCCA7G CCGAAGAAG7 ACGACAGCCG 
2 3461 7CCGGCGAGC ACCGCGGGC7 GTG7GC7G7A GGCGCCGAAT CCGCCCAGG7 CCGCGCCCG7 
2 3521 GCCG7AGCCG 7AG7AGAAGC CGCCGACGAA GACGCCGG7G TCGCTGGCGC GCAGGG7G7C 
30 2 3581 CGGCACGA7G CCGGCG7G77 CGAGCGCC7C CCAGGCGA77 7CGAGGAGGA 7CCGC7GC7G 
2 3641 CGGG7CGAG7 GCGG7GGCCT CGCGCGGAC7 GA7GCCGAAG AACGCGGCA7 CGAAG7CGGC 
2 3701 GGCGCCCGCG AG7GCGCCGG CCCGCCCGGT GGCGGAC7CG GCGGCGGCG7 GCAGCGCGGC 
2 3761 CACG7CCCAG CCGCGG7CGG 7GGGGAAG7C GCCGA7CGCG 7CGCGGCCG7 CCGCGACGAG 
23821 C7GCCACAGC 7C77CCGG7G AGG7GACGCC GCCCGGCAG7 CGGCAGGCCA 7GCGGACGAC 
35 2 3881 3GCGAGCGGC 7CGTTCGCCG CGGCGCGCA3 CGCGGTGTTC 7CCCGGCGGA GC7GCGCG7T 
2 3 941 G7CC77GACC GACG7CCGCA GCGCC7CGA7 CAGG7CGTTC 7CGGCCA7CG CC7CATCCC7 
2 4 001 .7CAGCACGTG CGCGA7GAGC GCG7C7GCG7 CCA7G7CGTC GAACAG77CG 7CG7CCGGCT 
24 061 ZZGCGGTCGT GG7GC7CGCG GG7GCC7G7G CCGG7GG7TC ACCGCCG7CC GGGG7CCCGT 
2 4 121 7G7CG7CCGG GG7CCCGT7G ACG7GCGGGG CCAGGAGGG7 CAGCAGA7GA CGGG7GAGCG 
40 2 4 131 GGZCGGGGGG GGGA7AG7CG AAGACGAGCG 7GGCCGGCAG CGGAA7GCCG AGGGCC7CGG 
2 4 241 AGAGCCGGTT GCGCAGGCCG AGCGCGG7GA GCGAG7CGAC CCCGAGG7CC 77GAAGGCCG 
24 301 7GG7GGCCG7 GACCGCCGCC GCGTCGG7G7 GGCCCAGCAG GGTGGCGGCG G7G7CGCGGA 
2 4 361 CGACGCCGAG CAGCACCTG7 7CCCG77CC7 7GTGGGGCAG G7CCGGCAGG CG77CCAGCA 
2 4 421 GGGAGCCGCC G7CGG7CGCG GAGCGCCGGG 7GGGGCGC7G GA7CGGTCGC CACAGCGGTG 
45 24 481 ACGGG7CGCC GGGCCCGGG7 GGGGCGGTCG CCACGACCAC GGCT7CCCCG G7GGCGCACG 
2 4 541 CGGCG7CGAG GAGG7CGG7C AGCCGG7CCG CCGCGGCGG7 GAACGCCACG GCCGGGAGGC 
24 601 C7TG7GCCCG GCGCAGG7CG GCCAGGGCCZ GGAGCGG7CC GGCCGCC7CG CCGGACGGAA 
2 4 661 CGGCGAGAAC GAACGCGG7C AGG7CGAGGT CGCGGGTCAG GCGG7GCAG7 TCCCAGGCCG 
2 4 721 AC7CGGCGG7 GCGGTGCGGG TGGACGACCG CGGTCACCGG GG777CCGGC AC7G7GCCCG 
50 2 4 781 GCTCG7ACCG GATCAC77CG GCGCGG7G7C CGCCGAGG7G 7CCGGCGAG7 7CC7CCGAAC 
2 4 841 GGCCCGCGAG GAGGACGG7G TCGCCG7ACG AGGCCGCGGC CG7GG7GGGC GCGGCGGGGA 
2 4 901 CGAGGCGGGG CGC77CGAGG CGCGZGZCGG CCAGGCGCAG G7GCGG7TCG 7CGAGGCGGG 
24 961 AGAGGGCGGC GGCGCGGCGG GGGG7GACCG 7G7CGG7GG7 CTCCACGAGC ACGAGCCGGC 
2 5 021 CCCG77CCGC GG7G7CGAGC AGTGCGGCGA CCGCACCGGC GAGGGGCGGG GCC7CGGCGG 
55 2 5 0S1 ACACCACCAG CGTGGCGCCG GCGG7CC7GG GG7CGTCCAG TGCGGTACGG ACCTCGTCGG 
2 514 1 GAGCGGATAG CGGGACGACG A7GAGG7CGG CCGTGGCGTC G7CGCCGAGG TCGG7G7AGC 
2 5 2 C 1 "GCGGGCCG7 GGTGCCGGG7 GCCGCCGGGG CCCGGACGCC GG7CCAGG7G GGGGGGAACA 
— -wGGCACG7C ZZGGTGCGGG CCCG7CG7GG CGGGGooG\_o vjG7GA7GAGC GAGGCuATC'I' 
2 5 321 GAGCGACCGG CCGTCCCAG7 7CG7CGGCGA GG7GCACGCG GGCGCCGCCC •TCGCCCTCGC 
60 2 5 381 CG7GGACGAA GG7GACGCGC AG777CG7GG CGCCGC7GG7 GTGGACACGG ACGCCGGTCA 
ZGGZGGZZGC GCCGATGCTG
7G7AGCGGGC GGCGTCCCTG 
CTCCGTGGCT CCACGCGGCG 
C 37CGAG7CG CTGGTAGAAG 
GCCCCGGCGT GGTGGCCGGT 
TCCATGTCCG GTCGCCGTCC 
CGGGCGCGGC GACGGTCACG 
TCTCGACGAC CAGTTCGTCG 
ATTCCAGGAA GGCGGGTCCG 



GClj AG GGCGC 
vjACA.TGCCGC 
gccgcgacc7 
i s-^jv^TGGTGG 
GTCCGGGCGT 
CGCACCTGGA 
AGCAGGTCGC 
G G C AG C A G T A 
~ ' "CGGCCGG 



:cgcc 

joTGC 
AG C AG CACGG 
GTCACGGCCG 
~-GCGGCGCGT 
\:AGGT GG TGT 



: - 2CCGAACGG CAACCGTACC CCCCCGTTC7 
GCGCGGTGAC GAGGAGCGCG GGG7GCAGTG 
CG7CGAGCGC GACTTCGGCG CAGACGGTGT 
"AAC7CGGG GCCGAAC7CG 7A7CCCGCG7 
CGACCGG77C CGCGTGCTCG GGCGGCCAGG 
CGA7GCCGGC GAAGCCGGAG GCGTGGCGGG 
GGACGCGCAC GGCACGGCGT CCGGTGTCGT 
ZGGCGCCGGT GGCGGGCAGG ACCAGCGGTG 
AGCCTGCCTC GTCGGCGCCG ZGTZCGGCZA 

CGGCGCCGTC GACGGAGTGA CCGGCCAGCG ATGGGTGGGT GGCCAGCGAG 
7GAGCAGCAC CTCGTCGGAG TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGG7GGTCGA 
CGGCG7CGAG TCCGAGGCCG GAAGCGTCCG TGGCGGCCGC GGTCTCGATC CAGTAGCGCT 
CA7GG7GGAA GGCGTATGTG GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG 
ZZZAGTCGAG GGGCACGCCG GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC 
C7CCCCCGCC GCGGCGGAGC GTGGCGACGG 7CGCGCCGTC GATCGCGGGC 
ZGTGCGCGCT GACC7CGACG AACACGGTG7 CACCCGGCTC GCGGGCAGCG 
7GGCGAAGCC 7ACGGGG7GG CGCATG77GC GGAACCAG7A C7CG7CG7CG 
CGA7CCAGCG 7TCG7CGGCG G7GGAGAACC ACGGGA7CTC GGGCG7GCGC 
CCGCGACGAT CCGC7GGAGT TCGTCG7ACA GCGGG7CGAC GAACGGGG7G 7GGG7CGGGC 
AG7CCACGGC GATGCGGCGC ACCCAGACGC CGCGGGCC7C GTAGTCGGCG A7CAGCG77T 
CGA.CGCCGTC ZGGGCGCCCG GCGACGGTCG 7GG7GGTGGC GCCG7TGCGG ZCCGCGACCC 
AGACGCCGTC GATCCGGGCG GCA7CCGCC7 CGACG'TCGGC GGCCGGGAGC GCGACCGAGC 
CC A7CGCGCC GCG7CCGGCG AGT7CGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA 
GGCGGGCACC G7CCTCCAGG GTGAGCGC7C CGGCGACACA GGCCGCGGCG ATC7CGCCC7 
2 6881 GGGAG7G7CC GA7GACGGCG 7CCGGGCG7A CGCCCGCGGC C7CCCACACG GCGGCCAGCG 
2 694 1 ACACCA7GAC GGCCCAGCAG ACGGGG7GCA CGACG7CGAC GCGGCGGG7C ACC7CCGGG7 
27001 CG7CGAGCA7 GGCGATGGGG TCCCAGCCCG 7G7GCGGGA7 CAGCGCG7CG GCGCA77GGC 
27 061 GCATCCTGGC GGCGAACACC GGGGAGGCCG CCA7CAG77C GACGCCCA7G CCGCGCCAC7 
27121 
27131 

2724 1 CCGCGGCGA7 GGCGCGCGGG 7CG7GGCCGG GACGGGCGGC GAGGTGC7CG CGGAG7GGGC 
GGACCTGGCC G7CGAGGGCC G7GGCGG7CC GCGCCGAGAC GGGCAG7GG7 G7GAGCGGCG 
7GGCGA7CAG CGGC7CACCG GGC77CGAGG ZCGACGGCTC C7CGGCCGGC GGC7CCCCGG 
CCGGG7GGGC 77CCAGCAGG ACG7GGGCG7 7GG7GCCGC7 GACGCCGAAG 
CG3CGCGCCG CGGGCGG7CG GTCTCGGGCC AGGGCCGGGC A7CGG7GAGG 
CG7CGGCCG7 CCAG7CGACG 7GCGAGGACG GCG7G7CCAC G7GCAGGG7G 
ZZZZZTGCCG CA7GGCGAGG ACCA7C77GA 7GACACCGGC GACACCCGCG 
7G7GGCCGA7 G77GGAC77C AGCGAGCCCA GCAGCACCGG GG7G7CGCGC 
AGG7GGCCAG CACCGCC7G7 GCC7CGA7GG GA7CGCCCAG CC7GG7GCCG 
CC7CCACGGC G7CCACG7CC GCCGGGG7GA GCCCGGCG77 GGCCAGGGCC 
CCCCC7CC7G CGAGGGCCCG 77CGGCGCCG ACAACCCG77 GGAAGCACCG 
2 7 901 CCGCCGAACC CCGGACAACC GCCAGCACAC GG7GGCCG7T GCGC7CGGCA 7CGGAGAGCC 
2 7 961 7C7CGACGA7 CAGCACACCG GACCCC7CGG CGAAACCGG7 GCCGTCAGCC GCA7CCGCGA 
2 S021 ACGCC77GCA GCGCGCGTCG GGCGCGAGAC CCCGCTGC7G GGAGAAC7CG ACGAAGCCGG 
2808 1 ACGGCGAGGC CA7CACCG7G ACGCCGCCGA CCAGGGCGAG CGAGCA77CG CCGGAGCGCA 
2814 1 G7GACTGCCC GGCC7GG7GC AGCGCCACCA GCGACGACGA ACACGCCG7G 7CGACCG7GA 
CCGCCGGACC C7CCAGACCG 7AGAAG7ACG ACAGCCGACC GGACAGCACA C7GG7C7GGG 
7GCCGGTCGC GCCGAAACCG CCCAGGTCGG 7GCCGAG7CC G7ACCCG7CG GAGAAGGCGC 
CCA7GAACAC GCCGG7G7CG C77CCGCGCA 
GCGCC7CCCA CGAGG7C7CC AGGACCAGAC 



27301 
27361 
27421 
2 74 3 1 
2754 1 
2 7 601 
27661 
2 7 2 1 
2"731 
27841 



GCGG7CC77G 7CCGGGGAAG ACGAAGACGG 7GCGCGGC7C GG7GAGCGCC G7GCCGG7GA 
CGACG7CG7C G7CGAGCAGC ACGGCGCGG7 GCGGGAACG7 CG7ACGCC7G GCGAGCAGGC 
<- z 5 6 1
2 S 92 1 
2: 981 



GCGAC7CCGG GAGGA7CCCG GCG7G77CCA 
GC7GC7GCGG G7CCA7CGCC AGCGCC7CAC 



GCGGACTGAT CCCGAAGAAC 

G AC G CACGG 7 CGACG7GCCC 

AACCACGGTC CG7CGGAAAC 

AG7CC7CCGG CGACGCGACC 

GC7CGTCC7G CCGGACGGCC 

GZGZZGCGGT GAGC77CGCC 

GCCG 7ACGCCGG7C 



GCCGCG7CGA AG7CCGCCAC CCCGGCGAGG AAGCCACCA7 



.-. w . C3ACGCC 
s-G.-vG7 ACGGC 
v- ov; A\j AGCCG 



GAG77CC77G 
CGCGG7GCAC 
CGCGA7CCGG 



GGA7GA7CCG 
GCCG7GA7CC 
CZACCGGGCA 
GCGG7CG7GG 
GCGACGGCoC 

nAZ Lj 7 O vj ^. G \j 



A7CGGGA7C G7ACAGCCCG 
G7CACCACC CGACTCCAGC 
CCGGCAGGC CA7CCCCACG 
GCGGG7CGG CGA7GCCG7C 
CGGCGTCGG GAAG7CGAAG 
CAGCCGGA7C 
CCG7GCGGCA 



AGGCG77G 
ZZGCCTCG 



7GCCGGACGA CGGCGAGCAC G7CC7777CG 
7CGGCGAGGG 7GGTGGCGCC GGCCGCCCGG 



7CCACG7CCC 
A C C C Lj C A C r\ 
A7CGCCAACG 
CGGCCGGACA 

rv— Z\jZ jvjTvju 

Z Z Z A 7 Z AG C 
2904 1 CCCGGCGCGG 7GCGZGZAGC AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGA"A

2 9101 GCGCCGGGTC CGAGGACCGC AACGCCGCGT CGAACAGCGT CAGTCCGCCT 7ZZGZGZ: 




. i^GGCGT GCCGCTGCGC GACAGCATC^ 
..964 1 GCAGCCGGCG CACGCCGTGG CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGC3— 
29701 CGGAGCCACC GGTGACGAGC ACGGTGCCGT CCGGG7G3AG CGCCGGAGCG ^ZAZZZ~~ZG 
2 9761 GGACCGCGGG GGCCAGACGG CGGGCGTACA CCTGGCC3TC ACGCAGCACC AGCTGG"G^T 
2 9821 CATCGAGCGC GG7GGCZGCT GCGAGCAGZG GCTCGGCGGT GTCCGGGGCG GGGTCGACGA 
2 9881 GGACCA7CGG GCGGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GZZGZGGZZG 
2 99-11 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA 
30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 
30061 CACCGCCGCZ GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 
30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA 
30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA
30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG 
30301 CCGTGGCGCC GGTGGCGTGG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT 
30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTG CAGGCCGTAC CGTCCGGCGT 
30421 CGGCGAGCTG TCCGTCGGCG AGGGCCACTT CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA
25 304 81 CGGCGZGZGG GCGGGGCAGC GCGGGCCCG7 CCGTGTACCC GGCTCGGGCC AGACGGTCGG 
3054 1 CGATG7CG7G GGGG7CCACC GGCCGGGCCG TGGCGGGCGG CCACGTCGAC GGCATCTCCC 
30601 GCACGGCCGG ZGCZGTCCGC GGGTCGGGGG CGAGGA7TCC GTGCGCGTGC TCGGTCCACT 
30661 ZZCCZGCCGZ GTGZCGCGTG TGCACGGTGA CZGCGCGGCG GZCGTZCGCC ZGGGGZGZGZ 
30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG ACCGCGGCAG CGTGAGGGGG GTGTCCACGC 
30 30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT ZGTCGZZZGC CCGGATCGCC AGATCCAGGA 
3084 1 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT 
30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG TGCCGGGCAG GGTGACCGCC GCGCTCAGCG 
30 961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG GGGZZGZGTC GCCCGCGGTC TGGGTGZZGA 
31021 GCZAGTAGZG GACCZGCTCG AACGGGTACG TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 
35 31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA CGCCGTCGGT GTGCAGCCGG GCGAGZGCGG 
3114 1 TCAGGGCGGA TCGCGGTTCG TCGTCGGCGT GCAGCATCGG GATGCCGTCG ACGAGTCGGG 
31201 TCAGGCTCGG GTCCGGGCCG ATCTCCAGGA GCACCGCCCC GTCGTGCGCG GCGACCTGTT 
312 61 CCCCGAACCG GACGGTGTCG CGGACCTGTC GTACCCAGTA CTCCGGCGTG GTGCAGGCGG 
31321 ZZZZCGCGGC CATCGGGATC CTCGGCTCGT GGTACGTCAG GCTCTCCGCG ACCTTGCGGA 
40 31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG AGTGGAACGC GTGGCTGGTC CGCAGGCGGG 
3144 1 TGAAGCGGCC GAGCZGGGCC GCGACGTCGA GCACCGCCTC CTCGTCACCG GAGAGCACGA 
31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT CCACGCCGTC CCGCAGCAGC GGCAGCGCGT 
31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG CCCCGCZGGA CGGCAGCGCC TGCATCAGGC 
31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT CCTCCAGGGA CCAGACGCCG GCGACGTACG 
45 31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA CGAAGGCGTC CGGGCGTACG ZCZCAZGZZT 
3174 1 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG CGAACACCGC GGGCTGGGCG TACCCGGTGT 
31801 CGTGGAGGTC GAGCCCGGCG GGCACGTCGA GGGZG7ZZAG CACCTCGCGG CGAGTGCGGG 
316 61 CGAAGACG7C GTAGGCGGCG GCCAGTCCGT CGCCCA73CC GGGACGTTGT GAGCCCTGTC 
31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG GTTCTGCGGC GCCGG7GACC GTGTCGGTGC 
50 31981 CGATCAGGGC GGCCCGGTGC GGGAAGGCCG 7GCGGGCGAG CAGGGCCGCG GCCAZCGZGC 
3204 1 GC7CGTCG7C GTCGCCGGTG GCGAGGTGGG GGCGCAGGGG G7G7ACCTGT GCGTCGAG7G 
321C1 CC7GCGGGG7 ZZGTZZZGAG AGCAGCAGGG GCAGCGG7CC GG7G7CGGGT 3GCGGGGGGG 
3 2161 GTTCGGGGGC CGGTCGGGGG TGGC7T7CGA GGATGA7GTG AGCG77GGTG CCZZ7.-S-.Z3C 
32221 CGAAGGAGGA CACGGCGGCG CGCCGTGGGC GGTCGG777C GGGCCAGGGG ZGGGCG7ZGG 
55 3 2 231 TGAGGAG77C GAZGGZGZCG GCCGTCCAGT CGACG7GCGA GGACGGCGTG TCCACGTCCA 
22 34 1 GGG7GCGCGG CAGGGTGGCG TGCCGCATGG CGAGGACCAT CTTGATGACA CCGGCGACGC 
32 401 ZCGCGGCGGZ CTGAG7GTGG CCGATGTTGG ACTTCAGCGA GCCCAGCAGC ACCGGGG7GT 
j-ii-ici CGCGATGC7G CCCG7AGG7G GCCAGTAGCG CC7GCG 2G7C GA7GGGG7CG CGCAGC27GG 
32 521 TCCCGGTGCC ATGCGCCTCG ACAGCGTCCA CATCCGZZGG GGTGAGCCCG GCGTTGGCCA 
60 32 581 GCGCZ7GZZG GATCAZCCGZ TCCTGCGACG GCCCG77CGG CGCCGACAAC CCGTTGGAAG 
33061
3 2121 
33151 
3324 1 
33301 
33361 



GACGAACACG 
CGACCGGACA 
GCCGCC GTCGGCTCCA GTGCCGTACC 



r;^ 1 S^tZZZ^ GTTGACCGC C GAACCACGCA CCACCGCCAG GACATTGTGG CCGTGCCGCT 
3.701 U3G^G_.,wGA GAGCCTCTCG ACGATCAGCA CACCGGATCC CTCGGCGAAA CCGGTGCC^ 
_^'/61 CAGCCGCATC CGCGAACGCC 77GCAGCGGC CGTCCGGGGA GAGGCCCCGC TGCTGGGA^" 

ztll] 2;I^:': GAA GCCGGACGGC gaggccatca CCGTGACGCC GCCGACCACG GCGAGCGAG" 
AC:CC — a GCGCAGCGAC TGCCCGGCCT GGTGCAGCGC caccagcgac 

3294 1 ccctgtccac cgtgaccgcc ggaccctcca aaccgtagaa gtacgacag^ 
"001 gcacactggt ctgggtgctg gtggcaccga aac 

CGTAG.-_-.GT A GCCGCCCATG AACACGCCGG TGTCGCTTCC GCGCAGCGAC TCCGGGAGGP 
7CCCGGCG7G TTCCAGCGCC TCCCACGAGG TCTCCAGGAC CAGACGCTGC TGCGGGTCCA 
TCGCCAGCGC CTCACGCGGA CTGATCCCGA AGAACGCCGC GTCGAAGTCC GCCACCCCGG 
CGAGG^GCC ACCATGACGC ACGGTCGACG TGCCCGGATG ATCCGGATCG GGATCGTACA 
GCCCGTCCAC GTCCCAACCA CGG7CCG7CG GAAACGCCGT GATCCCGTCA CCACCCGACT 
CCAGCAGCCG CCACAAGTCC TCCGGCGACG CCACCCCACC CGGCAGCCGG CAGGCCAT"^ 
3 3421 CCACGATCGC CAACGGCTCG TCCTGCCGGA CGGCCGCGGT CGGGG7ACGC CGCCGGG7GG 
3 3481 7GGCCCGCGC GCCGGCCAGT TCGTCCAGGT GGGCGGCGAG CGCCTGCGCC GTGGGGTGG^ 
335 4 1 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT 
3 3 601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT CCCTGAACGC GCGCGCGGGT GCGATGGCGT 
33 661 GGGCGTCGCG GTGGCCGAGC ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC 
33 721 GCGCGGCCGG AGG7GCGGAC G7GCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG 
337 81 GGACCCGGTC GGACGCGGCG ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC 
33841 GGTCGGTGTG CAGGGCCGCG TCGAACAGGG CGAGCCCC7G 7GCGGCCG7C ATCGGGGTCA 
3 3 901 TGCCG77GGG GGCGATGCGG GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG 

33 961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGG.CAG CCCCTGGTGG TGCCGGTGGC 

34 021 GGGCGAGCGC G T CG AG G AAC GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA 
34 081 ACGTGGCGGA TATGGACGAG TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT 
34 14 1 CGTGCAGGTG CCAGGCGACG TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC 

GCATGGTCGT CACGGCCGCG TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC 
GCTGGGCGAC GTCGGCGACG ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGGCA 
CGTACCGCAC GCGGTCGTCC TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA 
CGACCTCGGC GGCCTCGTGC ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC 
CGG7GCCGCC GGTGACGAGG ACGGTCCCGC CGG7CAGCGG GGAGGTTCCG GTGGCCGCGG 
CGACACGGCG CAGACGGGCC GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT 
CGCCGGCGGC GAGCCCGGCC GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGGG 
CGACGCGGCC GGGATGCTCC GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCCTCCG 
34681 CGGGATCGCC GGTACGGGTG GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG CACGGTGGTG AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG GTCGCCGGGT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTACGTC CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCGGCA TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCCGTGCTCG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
3 504 1 CCAGCAGCAC GCGCAGCGCG qTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA 
35101 ACGCCAGCCG GCGCCGCTCC GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT
35161 CGAGCAGCAC GGGGTGCAGC CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT
35281 ACGAGAGCGG CAGCGCGTCG TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG
35341 GCCAGTCCAC GGGCTCCGCC GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA
35401 GCGCCCAGGG GCCCGTGCCG GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAC GGTGGCCTGG ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG CTCCGCGATC TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT
35581 CCACGAGCGC CGAGCCGGGC ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG
35641 GCTGACGGCC TACCGAGACA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA
CCCACGAGCC GAGCAGCGGG TGGCCGGACG TTCCCCCCGG 



34201 
34261 
34321 
34 381 
3444 1 
34501 
34561 
34621 



J570 
35761 
35S21 
3588 1 
3594 1 
3 6001 
3 6061 



:CGCGTCG ATCCAGTAGC 
GGTCACGGCG GAACGGGTAC GTGGGCAGCG GCACCACCCG ACGCGTCGCG AAC G AC C AG 0 



TGACGGGCAC GCCCCGGACC CAGAGCGCGG 
CC7CGCZ7CG CCGCAGTGTG CCGGTGACGA 
CCAGTGCGGT GGTGAGCACG GGATGCGCGC 
GCCGG7CGCG GCGGCGAACC 



CcGv^Cr-.v^vjTG 

AGGCGGCGTC 



TCCAGCCACG 



CGAGCGACCG AGTGAAGCGG 7CCAGGCCGC 
CCCTATGCGC ATGCCCGGCG AGCGTGTCCT 
CGAC GAACGCGCGG TATCCGCGGT 
GCGCAGG TTG 
GGTGG AG AAG 



tgacc: 
gaacgg7gcg 
cctcgtccac 



rCGT/VCCAGl 



r-_r.'w00vj 



ac: 



3 6121 CCGGCGTGCG CGGAGTGATG CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA 
36181 CATGCGCGGT GTGCGACGCG TAGTCGACGG CGA7CCGGCG GGCGCGGGGG GTGGCGGCCA 

2624 1 G7AGCTCCTC CACGGCGTCG GCCGCACCGG CGACAACGAT CGACGCGGGT CCG^7GACC~ 

2 6^01 C3GCGACCTC CAGGCGCCCG GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCVCGAGA 
- 55 ~; r:^ TGCCGCC C ^ GCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACC7 

.ZZZGGCGTC GTCCAGGGTG AGCACCCC3G CGACGCAGGC CGCGGCGACT ^CG ~ / "* T *GG~ 

3 6481 AGTGGCCGAC GACCGCGGCZ GGGGCGACCZ CGTGCGCACG CCACAGCTCG GCCAG^GCC^ 
3654 1 GGATCACCGG GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGC7CCGG 
2 6 601 "CGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGAGC GCCGTGGCG" 
36661 AGTCGCGGAG CCGGCGGGCG AACACGGGCT CGGTGGCGAG GAGTTCGGCA CCCA7GCCGG 
36721 "CACTGGGA GCZC7GCCCG GGGAACGCGA AC AC G AC AC G TGTGTCGGTG ACGTCGGCG^ 

2 6781 77CCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TC7CGGGCCG 
3684 1 ZZACGACCGC CCGGTGGCGC ATGGCCG7CC GGGTGGTGGC GAGCGAGTGG CCGACCGCGG 

3 6 901 CC3CGGCGCC AGTGAGCGGG GCCAGC7G7C CCGCGACGTC CCGCAGTCCC TCCGGGGTCG 
3 6961 GGGCCGACAT CGGCCAGACC ACG7CC7CGG GCACCGGCTC GGCTTCGGGT GCGGACACGG 
3 7 021 G.3CGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC 
37 081 CGAACGACGA GACACCCGCA CGCCGGGZGC GCCCGGTGAC ZGGCCACGGC TCACTGCGGT 
37 14 1 GGAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGCTCGTCG ACGTGCAGCG 
37 201 T ZZGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG 
3 7261 GGGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC 
3 7 321 G77CGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC 

2 7 381 C3GTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG 
374 4 1 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC 
37 5C1 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCGAGCAC GGGGTGGCCG TGGCGGGTGG 
37 561 CG7CGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG 
37 621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT 

3 7 681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT 
37 741 CCCCCGAGCG CAGCGACCGC GCGGCC7GGT GCAGCGCCAC CAGCGACGAC GAACACGCCG 
37801 7GTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA 
3-7861 CGCTGGTCGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCC7 

37 921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC 
3798 1 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG 
3604 1 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA 
38101 GGAAGCCGCC GTGACGCACG GAAACC773C CGACCGCGTC GGGGTTCGGG TCGTAGAGCG 
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGA7 CGCGTCCCCG CCGGAGTCGA 
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA 
33231 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG 

38 34 1 CAGGGGCCGC C7CACCCCGC CGTTCC7CA7 CCAGGCGGGC GGCGAGCGCG GCCGGTG7CG 
384 01 GG7GGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCG7 CGTCTCGGCG AGGC7G7TGC 

35 4 61 GCAACCGGAC ACCGCTGAGC GAG7CGA7GC CGAGGTCC77 GAACGCCG7C GTGGGCGTGA 

36 521 777CGGAGGC GTCGGCGTGG CCGAGCAC3G CGGCCGTGGC CGCACACACG A7GGCCAGCA 
33 561 GG7CACGATC GCGGTCGCGG TCGCGG7CGC GGTTGTCCTC CGCACGGGCG GCGA7GCGGC 
36 641 CCTCGGTCCG C7GCCGGACG GGCTCGG7GG GAATCGCCGC GACCATGAAC GGCACGTCCG 
38701 CGGCGAGGCT CGCGTCGATG AAGTGGG7GC CCTCGGCCTC GGTGAGCGGC CGGAACCCG7 
387 61 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGG7GTGCC 
38821 ACATGCCCCA GGCGATGGAG GTGGCGGG7T GGCCGAGGG7 GTGGCGGTGG GTGGCGAGGG 
38881 C3TCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG 
3894 1 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGT7 TTGGGTGAGG TGG7GCAGG7 
3 9001 GCCAGGCGGC G77GGCTTTG GGG7GGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCG7 
3 9061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GC77GGGGGA 
3 5121 7G7GGGCGAG GG7GGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC 
29181 C GGGGGTGG7 G7CGGGGGGT GGGG7GZGGG AGAGGAGGTA GGTGTGGGGG TGGT7CAGG7 
3924 1 GGCGGGCGAG GATGCCGGCG AGGG7GCCGG AGCCGCCGG7 GATGATGATG GCGTGTTCGG 
39301 GG7TGAGGGG GGTGGTGGTG GGTGGGG7GG TGGTGTGGAG GGGGGTGAGG TGGGG7CGG7 
39361 GGAGGGTGTG G7GGGTGAGG CGGAGG7GGG GGTGG7CGAG GGTGGCGAGT TGGGCCAGGG 

2 9421 GGAGGGGAG7 G7GGGGG7GG TCGG777CGA TGAGGCGGA7 GCGGTGGGGG TC77CG77C7 
394 8 1 GGGCGGTGCG GGTGAGGCCG GTGACGG7GG CGCCGGCGGG G7CGGTGG7G G7GTG3ACGA 
3954 1 7GAGGG7G7G G7CGGTGGTG GTGAGG7G3T GT7GCAGGGC GGTCAGGACG CGGG7GGCGC 
3r6C". 3GC7GTGGGC GCGGGTGGGT ATGTCC7C3G GGTCGTCGCG OTGGGCGGCG G7GA.7CAGGA 
-■ ?ct; ~ _;7CTCCCTC GGGCAGGTCA CCG7CG * A\jA CCGCCTCGGC G ACCGCGACC C.--C . 2CAACC 

3 9721 GGAGCGGGTT CGGCCCCGAC GGGG7G7CGG CCCGCTCCCT CAGCACCAGC GAG7CCACCG 
3 9781 ACACGACAGG ACGGCCATCC GGGTCGGCCA CGCGCACGGC GACGCCGGCC TCCCCCCGGG 
4 314]
TGAGGGCGAC
AC3GCAGCTC 
GTGCCGGATG 
7 3 GC AT AC AC 
ACTCATAACC 
i 'jGCGGGCGG 
GGGTCAGGGT 
ZZGTCACCGG 
C73CGGTCAC 
CGGTCTCGTC 
TZZCCCGCAC 
GAACAACACC 
ZZZZCCCGGT 
GCCTGTGCTC 
CCGTGCCCCA 
GZZCCCAGCC 
GCAGCAGCAC 
CCGCATCCAG 
CGGTCACCCA 
■ 77CCCTTCAG 
CG7AGTCGAC 
CZZCCACCGC 
TCCACACACC 
CZZGGCCGGC 
CC7CCAGGCT 
CCACAGCGTC 
CCCAGCTGGC 
CCCGCACATC 
CGAACACCGC 
CGGGGAAGAC 
CCAGCAGCAC 
CGGCCACATC 
GACTCACCTC 
CACGCGACGG 
ACGACGACAC 
GCAGCTCCAC 
737TCGGCGC 
.CAGCCGCCTG 
GG7CCTGCCC 
CGGTGCCGTG 
CCTGCCGGAT 
CGTCCTGGTT 
CGTCGGAGAG 
CCGCGTCGGC 
CCACGAGCTC 
CCCCGGCCCG 
7GTCGACCGT 
CGCTCGTCTG 
GGTTGAACGC 
CGGCGTTCTC 
CZAGCGCCTC 
A7CCGCCGTG 
CGACGTCCCA 
GZZGCCACAG 
- -~ wCG ACGGG 
CG3CGAGCTG 
C33GCAGACC 
G3CCG77CTC 



GCGCACCGZG 
GATCCZGCCG 
CACACCGAAA 



GGCATCCCGC 
CCACTGCGAG 
GCCGCTGGCG 
CCGCCGTCCG 
CGGCACCACG 
ACCGGCCCGG 
CGCGTGATCA 
ACCACCGTCG 
CAGCCCGGCC 
GAACGCGTAG 
GTCCACCCCC 
ACCGTCACCA 
CGGATGGGCA 
CGCGACAGGG 
GGCGCTGTCC 
TACCTCAGCG 
CGCGATACGA 
CGACGGGTCC 
CTCGACCAGA 
CAGCCGCGCC 
GAGGGCTCCG 
CGGCACGACC 
CGGCTGGACC 
CCAGCCCGTG 
GGAACGGTCC 
GAACACCGTA 
CGCACGGTGA 
CACCCCACCC 
ACCACGAGCC 
CCCAGGAACA 
ACCCGCATGC 
CGCACCGGCC 
GATCCCATGC 
CGCA7GACCG 
GTAGGCCGCG 
CGCCTCCACC 
CACGCGCTGC 
CACCGCCGAG 
CCGCTCCAGC 
GAACGCCT7G 
TGCGGTGTTC 
CAGTGCCTGT 
GACCGCCGGG 
CGTCGCCGTG 
GCCCATGAAC 
GAACGCCTCC 
GTTCGGACTG 
GCGTGTCGTG 
GCCCCGGTCG 
GTCCTCCGGC 
GTCGCCGGAG 
GGCGGCGAAC 



tjCGvjCCCCGG 
CCCGCG7CGA 
CCG7CCGCC7 
- C.—.\_ w w^,/-.Go 
AG77CGTCAT 
AACGGCTCAC 
TGCCGGG7CC 
GCCTCATCGG 
AGCGGGGA77 
ATGACCAGCT 
GCCAGCCAGG 
TCGGCGGGCA 
GCGGACAGGT 
GTGGGCAGA7 
GCACCCAGAG 
GTCCGCAACG 
CTGCACTCCA 
CGACGCAGGT 
ACGGTCGACC 
AGTTCGTCCT 
CGCACCCGCA 
CCCGCCACCA 
CCCACC7CAC 
GCGATCACCC 
GCCACACACG 
CCATGCGCCT 
ACCTCCACCC 
TGCGGCAACA 
ATGAGTTCCA 
CGCGGCTGAT 
CCGAAGACAG 
CCGCGCAGAT 
GACACCGGCA 
CCC7CCAGGA 
GGTGCCCGAT 
GACCAGTCCA 
CGCATCGCCA 
ATGT7CGACT 
AGGACGGCC7 
ACG7CCACAT 
TGGGCGACGC 
CCGCGGACGA 
ACGAGAACGC 
CACCGTCCGT 
GCCA7GACGG 
GCCGCCTGGT 
CCCTGAAGTC 
ACACCGAGCC 
ACGCCGGTGT 
CAGGAGGTCT 
ATGCCGAAGA 
GAGCGGCCGG 
GTGGGGAACT 
GAGGCGACCC 
CCGAGGG7CT 
GCAC G CGG AG 



ICGGAACG7G 



^ _ ■ J o t o . 



7GGCG7TCAG 
GGCGCCCGGC 
CGGCGGCCTG 
C — .GCC CGC AA 
AGAACCCCGA 
CGGAAGCGTT 
AGCTGCCCGT 
CCCC7TCCAC 
CGATGACCAG 
CCACAAACGC 
GA7GCGTACG 
GTGCTGTGAC 
CGG7GGCACC 
CCAGCAGCCG 
TCCACGCCTC 
ACGCCACCGT 
CGAACACCGA 
7CCGGTACCA 
ACCACGCCAC 
CGATGGCCTC 
CCCCATCAGC 
CCGTCGAAGC 
CGGCCGGCAA 
GACTGCGCAA 
CCGCCGCGAT 
GCCACAGCGC 
GC7CCGCCAC 
ACGCCCGCGC 
CGCCCATGCC 
CCACCGCCAC 
CACGCTCACG 
ACCCCTCCAG 
ACGGCACCAA 
TCACGTGCGC 
CCGACTCGGG 
CA7GCGACGA 
TGACCATCTT 
7GACCGAACC 
GCGCCTCGAT 
ZGGCGGCGCG 
CGTTGGGGGC 
CCGCGAGAAC 
CGACGCCCTC 
CCGGGGAGAG 
TGACACCGCC 
GCAGGGCGAC 
CGTACACGTA 
CGCCCAGGTC 
CGCTCTCCCG 
CCAGGATCAG 
ACGCGGCGTC 
CCGCGTCCGG 
CGG7GATCGC 
CGCCGGGCAG 
G3GCGG7CGC 
TGGGGTGGTC 
7G3TGAAC7C 
/ i-j\_rtG7vj7CC 



GCGCACGCCC 
GTGCAGGGCC 
CTCGTCGGGC 
CCCCTGGAAC 
GACG7CGACG 
GGAGG7ATCC 
GCCCTCGGTA 
GGTCACCGAC 
TTCATCCACC 
CGTACCCGGC 
CAATGAGATC 
GGCGGCCAGC 
GGCCGCCTCC 
CCCCGGCACC 
CGCCAACGCC 
GCGGGCCTGT 
CCCGTCCAGC 
GTACCCCTCA 
CGACCCGG7C 
CACGTGAGGC 
CTCATACCGC 
CGGACCATTA 
CGCCACCGAA 
CGCCACCACG 
CTCCCCCTGC 
GGCCAGGCTC 
ATCCGACCGC 
ACACTCCTCC 
CACCCACTGG 
ACCCATCACC 
CACCAACCCC 
CCGCTCCACC 
CCCATCACCA 
GTTCGTACCG 
CCACGGCCTC 
CGGCTCGTCC 
GATGACACCG 
GAGGTAGAGC 
CGGGTCGCCC 
CAGTCCGGCG 
GGACAGTCCG 
GGTGTGCCCG 
GGCGAAGCCG 
TCCGCGCTGC 
GACCAGCGCC 
CAGCGACGAC 
CGAGAGGCGC 
CCGGCCGACG 
GAGCCTGTCC 
GCGCTGCTGG 
GAACCCGGCG 
GTCCGGGTCG 
CTCGGTACCG 
TCGGCACGCC 
GGGTGCCGCT 
GAACGCGGTT 
G ACGG7GG7G 
GGGGGCCGGZ 



i cc ag gaga 

- C o T C G A G C A 
AGCGCCACC7 

rrrr.~ 



-iZZGCGGCCG 
\jGGGTGTCGG 
CGCGCGTGGA 
ACATCCACCG 
AC7CCGCAAC 
AGCAGAACCG 
CGGCCGGTGA 
ATCGGA7GCG 
AGCCAGTACC 
GG7TCGACCA 
CCCAGCCACC 
TCCATCGCCG 
TCCGCCACCG 
TCCACCGGC7 
CCGCCGGAAA 
GTGTGGGAGG 
GCCACCACCT 
CGCGCCGCGA 
GCCATCGCCC 
CGGGCGGCGT 
GAGTGTCCGA 
ACCGCGACCG 
GACAACATCT 
ATACGAGCCG 
GCACCCTGCC 
CGGGCATCAC 
7GCGCGACCG 
7GCCCCCGCA 
CCCGACTCCA 
C7CACCCCGA 
GCCTCGGTGA 
ACGTGCAGCG 
GCGACACCCG 
GGCGTGTCGC 
AGCCGCGTGC 
7TGACCAACG 
T7GGAGGCAC 
TTGCGCTCGG 
GTCCCGTCCG 
CGGGAGAACT 
AGGGAGCACT 
GAGCACGCCG 
CCGGACAGGA 
CCGTAGCCCT 
GGCACGATGC 
GGGTCCATCG 
CCGGCCAGGA 
7ACAGCGCGT 
GCGGCGACGA 
A T G C C G AC G A 
GTCGCGGAGC 
GACGCGGGC A 
AGCGAGTCGA 



4 3 321 CGGTGGCGAC GCTG7CGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG 
4 3381 ZZGCGAGGCG GTTCGCCCAC TCC7G7TCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG 
4 34 A I
4 3 5 C 1 
4 35ei 
J 36:i 
4 3 66 1 



CGCTGAG j .-. . 

GCGCCGGCCG 



•o o unnu v, o 

CCCTCCGCGG 

- j'vACcA-jC.; 

GTCTCG7CGC 
GCGTCCGGGC 
CTCATCC 



4 4 041 ACCGGCCGCC 
4 4 101 TGAGCACGAC 
44161 CA7GG7CGG7 
4 4 221 CGTACACCTG 
4423 



CGGCGGCGTG GCGCCGGCCA TCGTCGCGGC CC3CGCCCCG 
GATGTACGAG CCGCCGCCCG CGATGGCCTT C7CGATCAGG 
TTCGATGCCG GGCAGCGCGC GGACGGTGAC GGTGGGGAGT 
CCCG7GGCC3 GGTGTGGGCG TCGGCGCCGG CCGGGCCGTC GAGCAGGACG 
CGCCGGGG77 CGCGGCT7CC TCGGCTGCGG TGGTCACGTG GG7GAGGCCG 
4 3741 GGAGCAGGGZ GGCGACGGTG TCGGCGTCCT CCCCGGTGAC CAGGACCGGC 
4 3801 CGA7CGGAGG CGGCACGGTG AGGACCATC7 TGCCGGTGTp CCGGGCGTGG 
4 3861 CGAACGCG7C CCGCGCACGG CGGATGTCCC ACGGC7GCAC CGGCAGCGGG r AC AG C^^ AC 
4 3921 CGCGGTCGAA CAGGTCGAGG AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCA^G T 
'13981 CGGCCAGG7C GAACGGCTGC TGGGCGGCGT GGCGGATGTC GG7CTTGCCC ATCTCGACGA 
CGGTGCGAGC AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT 
GTCGACCGGC GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA 
GTCGAAGCCG TCGGCGTGCA GCAGGTG7TG TTTGGCGGGA C7GGCGG~GG 
GGCGCCGAGG TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACGGCZCG 
GCGGCG7G GACCAGGACC 77CTGGCCGG G7CGCAGCTC GCCCGCGTCG ACGAGGCCG7 
4 4 34 1 ACCAGGCGG7 GGCGAACACG ATGGGCACGG ACGCGGCGAT GGGGAACGAC CA7CCCCG' T, G 
4 4401 GGATCCG7GC GACCAGCCGC CGGTCCGCGA CCACGC7GCG CCGGAACGCG TCCTGCACGA 
4 4 461 GACCGAACAC GCGGTCGCCG GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA 
4 4 521 TGCCCGCGGC CTCCCCGCCC ATCTCGCCC7 CGCCCGGG7A GG7GCCGAGC GCGATCAGCA 
4 4 581 CGTCGCGGAA GTTCAGCCCC GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG 
4 4 641 GCGCGGCGGG ACGTCGAGCG GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCGG 
4 4 701 GCGCAGCGCC CACTGGCGCG GTCGGCAGGG GGGTGGTG7C CGCGCGTACC AGCCGGGGCA 
4 4 761 CG7AGGCCAC GCCGGCCCGC AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA 
4 4 821 CGAGGTCC-TC ATCGCCGTCC GTGTCCACCA GCACGAACGA TCCGGGTTCG GCGGCC7GGC 
4 4 881 GGCGCAGCGC CTCGTCCCAG AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA 
4 4 941 CGCCCACCGC GCGGCGGGTG ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC 
4 5001 GCCGCTCCCA GACCAGTTCG CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGA7GGG 
4 5061 CCGGCAGCCC CGCGAGCCGC GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGA7CG7GG 
4 5121 TGACGTGCCA GATCTCGTCG GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA 
4 5181 GGATCGCCTC GGCGGGGACG CGGGGGCCGT CGGAAACGAC GTAGAGCACG GG7ATG7CGC 
4 5241 CGAGGACGGG GTGCGGGCGG CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACC7CC7GGG 
4 5301 CGACGGTCTC GATCTCCCGG GGGTGGATGT 7CTCCCCGCC GCGGATGA7C AGCTCC77GA 
4 5361 CCCGGCCGG7 GATCG7CACG TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGG7GCGG7 
4 5421 ACCAGCCG7C CACGAGCACC TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCA7GA 
4 5481 GGCTCGGCCC GCTCGCCCAC AGCTCGCCC7 CCTCGCCGGG TGCCACG7CG GCGCCGGACA 
4 5 541 CCGGGTCGAC GAACCGCAGC GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC 
4 5 601 GCGCATCC7C CAGGG7GTTG GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCG7ACG7G7 
4 5661 CGAGCAGGGG CACGCCGAAC GTCGCCTCGA AATCCCTGGT GAGCGACGCC GGCGAGG7GG 
ATCCGGCGAC CAGCGCCACG CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA 

GGAG7GT TCGGCCAGGG 
CCGACCG7GA 

GGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA 
4 5961 GTTCGTCG7C CTCGG7CAGC CGCCAGGACG GCACGTCGCA GTGCATCGCG G AC C AC AG GC 
4 6021 CGC7GCGC7G TGCGGAAACC ACGCCCTTGG GACGGCCGGT GGTGCCGGAG G7G7AGAGCA 
4 6081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCG7 CGCGGGGCGG GCACGGCGGC TCGGTCCCGG 
4 6141 CGAGGTCCTC GTAGGAGACG CAGTCCGG7G CCCGGCGCCC GACGAGCACG ACGG7GGCGT 
4 6201 CGGTGCCGGT GCGGCGCACC 7GG7CGAGG7 GGGTT7CGTC GGTGACCAGC ACGGTCGCGC 
4 6261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC G7CCGGG7TG AGCGGGACGG 
4 6321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GG7C7CGA7C CGGTTGCCGA 
4 6381 GCAGCA7CGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGG7GT CCGGCGAGCC 
GGCCGGCCCG GAGCCGGAGT TGCG7GTACG 7CACGGCGCG 77GGGAATCC G7G7AGGCGA 
7CCGGTCGCC GCGTCGCTCG GCA7GGA7GC GGAGCAATTC G7GCAACGGC CGGA77GG7T 
A7GGAAACA CCT77CTC7C GACCAACCGC ACAACAGCAC GGAACCGGCC 



Ail 

15781 GGAGGTAGCG GTACA7CGTC GGCACGCCGA CGAGCACGGT 

4 5841 CG7CGAGGAC GTCACGCGCG ACGAAGCCGC CCAGGA7ACG 

4 5901 GGACGGCGAG CAGGCAGAGG 



GGCGGACGCG 



46441 
4 6.501 
4 6561 CCACACCCGC 



ACGAGTAGAC GCCGGCGACG C7AGCAGCG7 777CCGGACC GCCACZCCCT GAAGA7CCCC 
CTACCG7GGC CGGCC7CCCC GGACGCTCA7 C7AGGGGGT7 GCACGCA7AC CGCCG7GCGT 



4 6621 
4 6631 

4 6741 AA77GCC77C C7GA7GACCC ATGCCGGACG CCAGGGAAGG G7GGAGGCGT 7C7CCA7A7C 
TGTCACGCCG CCGTAT7GCC GC7TCGAGAA GACCGGA7CA CCGGACCTCG AGGGTGACGA 



- ATCG AGCA 1 



jCCACA.CC ga 



4 6 

4 65 c 
4692 

4 69S1 GCACGCACAG CGCCCTG7CG AG7CCGGCA7 GGACAACGGC A7CGCC7GGG CCCGCACCGA 



TGCTCCCCGG ACCGCGGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTC" 
^SSI ACCTG TT -GGTGTCG TGCGCACCGG CGAGAGCGGC AGG7ACGCCG ATGCCA^CGr
4*i01 GGw^TCTAC ACGAACGTC7 TCCAGCTCAC CCGGZZGCTG GGG7A7CC-- TGC^C^^CG 
4 lit} oACC " GAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTPCCr: 
A LZZ- GTGGGCCGZG CCCAGGCGZT CGACGAGGGC GGGATCCACC rGGCCAC^ 1 

:> -.-dl G-,CC»-CGGCC ACCGGTATCG. GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCG^ 
4/341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC 
4 7401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGGAGGGGCC ACCTGGCTGG GCCCGCCGGA 
4 74bl GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA 
4 7 521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC 
10 4 7 581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT 
4 7 641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGC^GCACG 
4 7 701 CCTG7CGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT 
4 7761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGGTGCGC 
4 7 321 C7CGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT 
15 4 7881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC 
4 7 941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT 
4 8001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC 
4 8 061 GGCGAGZZZC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG 
4 8121 GGCAGZGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC 
4 8181 GCCACCGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCT r 
4 8241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA 
4 3 301 CTCGCAGCCC ACTACACGGC GCTGGGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG 
4 8 361 CCGGTGCAG7 ACGCCGACTT CGCCGCC7GG GAGCGGCGCG AACTCACCGG CGCCGGACTG 
4 8421 GACAGGCG7C TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCT^ 
4 8481 GCCACCGACC GTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG 
4 8 541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG 
4 8 601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC 
4 8 661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC 
4 8721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA 
30 4 8 781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC 
4 8841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG 
4 8 901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCG 
4 8 961 GAACCGTTCC GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG 
4 9021 GAGCCGGGTG GCGCGCTGAC CSGCGAACTG CTCTACAGCC GTGCGCTGTT CGAGGAGCCA 
35 4 9081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCGG 
4 914 1 GACG7ACGGC 7G7CGCGGC7 GCCGGCCGGC GACGCGACGG CGGCAGCGCG CG7GG7GCCC 
4 9201 7CGAACGACA CGGCGCGGGA CCTGCCCG7G GACACGC7GC CGGGCC7GC7 GGCCCGG7AC 
4 9261 GCCGCACGCA CCCCCGGCGC CG7GGCCG7C ACCGACCGGG ACA7C7CCC7 CACC7ACGCG 
4 9321 CAGC7GGACC GGGGGGCGAA CCGCC7CGCG CACC7GC7CG GCGCGCGCGG GACCGCCACC 
40 4 9381 GGCGAGC7GG 7CGGGA7C7G CGCCGATCGC GGCGCCGACC 7GA7CG7CGG CA7CG7GGGG 
4 9441 A7CC7CAAGG CGGGCGCCGC T7A7G7GGCG C7GGACCCCG AACA7CC7CC GGAGCGCACG 
4 9501 GCG77CG7GC 7GGCCGACGC GCAGCTGACC ACGG7GG7GG CGCACGAGG7 C7ACCGT7CC 
4 9561 CGGT7CCCGG A7G7GCCGCA CG7GG7GGCG 77GGACGACC CGGAGC7GGA CCGGCAGCCG 
4 9621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC 7CGCC7ACGC GA7C7ACACG 
45 4 9681 7CCGGGTCGA CCGGCAGGCC GAAGGCCG7G CTCA7GCCGG G7G7CAGCGC CG7CAACC7G 
4 9741 C7GC7CTGGC AGGAGCGCAC GA7GGGCCGC GAGCCGGCCA GCCGCACCG7 CCAG77CG7G 
4 9801 ACGCCCACG7 7CGAC7AC7C GG7GCAGGAG ATC7777CCG CGC7GC7GGG CGGCACGC7C 
4 9861 G7CA7CCCGC CGGACGAGGT GCGGT7CGAC CCGCCGGGAC 7CGCCCGG7G GA7GGACGAA 

4 9921 CAGGCGA77A CCCGGATC7A CGCGCCGACG GCCG7AC7GC GCGCGC7GA7 CGAGCACG7C 
50 4 9981 GA7GCGCACA GCGACCAGC7 CGCCGCCC7G CGGCACC7G7 GCCAGGGCGG CGAGGCGC7G 

500 4 1 A7CC7CGACG CGCGG77GCG CGAGC7G7GC CGGCACCGGC CCCACC7GCG CG7GCACAA7 
50101 CACTACGG7C CGGCCGAAAG CCAGC7CA7C ACCGGG7ACA CGC7GCCCGC CGACCCCGAC 
50161 GCGTGGCCCG CCACCGCACC GATCGGCCCG CCGA7CGACA ACACCGGGA7 CCA7C7GC7C 
50221 GACGAGGCGA TGCGGCCGG7 7CCGGACGG7 A7GCCGGGGC AGC7C7GCG7 CGCCGGCG7C 
55 50281 GGCCTCGCCC G7GGG7ACC7 GGCCCG7CCC GAGC7GACGG CCGAGCGG7G GG7GCCGGGA 
5034 1 GA7GCGG7CG GGGAGGAGGG CA7G7AGC7C ACGGGCGACC TGGCCCGGGG CGCGCCCGAC 

5 0401 GGCGACC7GG AATTCC7CGG CCGGA7CGAC GACCAGGTCA AGA7CGGCGG CATCCGCG7C . 

t c j. j.tACcvo»;:vj rwj ATC»jAvjAo ZZT GZT ZZ ~Z GAGvj.'-iGoCGC GCG73ACGGA G 



50521 TCCG7GCGGG AGGACCGGCG GGGCGAGAAG 77GC7GGCCG CGTACGTCGT ACCGG7GGCC 
60 50531 GGCCGGCACG GCGACGAC77 CGCCGCG7CG CTGCGCGCGG GAG7GGGCGC GCGGC7CCGC 
5 064 1 GCCGCGC7CG 7GCCCTCCGC CGTCG7 777G GTGGAGCGAC 7GCCGAGGAC CACGAGCGGC
D0701 .-.AGGTGGACC GGCGCGCGCT CCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGG~GGTT 
50761 ACCCCCCGCA CCGATGCCGA GCGGACGC7G TGCCGGATCT TCCAGGAGG7 CCTCGA^g — 
30321 CCGCCGGTCG GTGCCGACGA CGAC777777 ACGC7CGGCC GGCAC7CCC7 CCT"r-~"~ 
50881 7GGGTCG7CT CCCGCATCCG CGCCGAGC7G GGTGCCGATG TCCCGC7GCG TACG.~TC~~<~ 
5094 1 C-ACGGGCGGA CGCCGGCCGC GCTCG7G7G7 GCGGCGGACG AGGCCGGCCC GGCCGCCCT'" 
=1001 7CCCCGATCG CGGCCTCGGC GGAGAACGGG CCGGCCCCCZ TCACCGCGGC ACAGGAACAG 
0 1061 A7GCTGCACT CGCACGGCTC GCTGC7CGCC GCGCCCTCCT ACACGGTCGC CCrGTACGGG 
5112i 77CCGGCTGC GCGGGCCACT CGACCG73AA GCGCTCGACG CGGCACTGAC GCGGATCGC'" 
51181 GCGCGCCACG AGCCGCTGCG GACCGGG77C CGCGATCGGG AACAGGTCGT r C GGCCGCCC 
5124 1 GCTCCGGTGC GCGCCGAGGT GGTTCC3GTG CGGGTCGGCG ACGTCGACGC CGCGGTCCGG 
51301 GTCGCCCACC GGGAGCTGAC CCGGCCG77C GACCTCGTGA ACGGGTCGTT GCTGGGTGCC 
51361 GTGCTGCTGC CGCTGGGCGC CGAGGA7CAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC 
514 21 GGTGACGGAT GGTCCTTCGA CCTCC7GG7C CGGGAGTTGT CGGGGACGCA ACCGGACCTT 
514 81 CCGGTGTCCT ACACGGACGT GGCCCGG7GG GAACGGAGTC CGGCCGTGAT CGCGGGCAGG 
51541 GAGAACGACC GGGCCTACTG GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC 
51601 GCGGTCCGGC CCGGCGGGGC ACCGACC3GG CGGGCGTTCC TGTGGACGCT CAAGGACACC 
51661 GCCGTCCTGG CGGCACGCCG GGTCGCGC-AC GCCCACGACG CGACGTTGCA CGAAACCGTG 
51721 C7CGGCGCCT TCGCCCTGGT CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG 
51781 ACGCCGTTCG CGGACCGGGG GTACGC7GGG ACCGACCACC TCATCGGCTT CTTCGCGAAG 
51841 G7CCTCGCGC 7GCGCCTCGA CCTCGGC3GC ACGCCGTCGT TCCCCGAGGT GCTGCGCCGG 
51901 GTGCACACCG CGATGGTGGG CGCGCACGCC CACCAGGCGG TGCCCTAC7C CGCGCTGCGC 
51961 GCCGAGGACC CCGCGCTGCC GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC 
52021 GCGGAACTGC GGCTGCCCGG CATG CA. 7 A -CC GAGCCGTTCC CCGTCGTCGC CGAGACCG^C 
25 52081 GACGAGATGA CCGGCGAACT GTCGA7CAAC CTCTTCGACG ACGGTCGCAC CGTC7CCGGC 
5214 1 GCGGTGGTCC ACGATGCCGC GCTGC7CGAC CGTGCCACCG TCGACGA7TT GC7CACCCGG 
52201 G7GGAGGCGA GGCTGCG7GC CGCCGC3GGC GACCTCACCG TACGCGTCAC CGGTTACGTG 
522 61 GAAAGCGAGT AGCCA7GCCC GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG 
52321 CGGAACTCCA GAAGACCCG7 GCGGAAC7CG CCGCGCACAG CGAGCCG77G GCGA7CG7GG 
30 52 381 GGA7GGCCTG CCGGCTGCCC GGCGGGG7CG GGTCGCCGGA GGACCTG7GG CAG77GC7GG 
524 41 AG7CCGG7GG CGACGGCATC ACCGCGT7CC CCACGGACCG GGGCTGGGAG ACCACCGCCG 
52501 ACGGTCGCGG CGGCTTCC7C ACCGGGGCGG CCGGC77CGA CGCGGCG77C TTCGGCA7CA 
52 561 GCCCGCGCGA GGCGCTGGCG ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACC7CG7GGG 
52 621 AGGCG77CGA GCACGCGGGC ATCGATCCGC AGACGCTGCG GGGCAG7GAC ACGGGGGTG7 
35 52 681 7CC7CGGCGC G7TC7TCCAG GGG7ACGGCA 7CGGCGCCGA C7TCGACGG7 TACGGCACCA 
5274 1 CGAGCAT7CA CACGAGCG7G C7CTCCGGGC GCCTCGCGTA C7TCTACGG7 C7GGAGGG7C 
52801 CGGCGGTCAC GG7CGACACG GCGTG777G7 CG7CGC7GGT GGCGC7GCAC CAGGCCGGGC 
528 61 AGTCGCTGCG C7CCGGCGAA TGCTCC-C7CG CCC7GG7CGG CGGCG7CACG G7GA7GGCC7 
52 921 .GCCGGCGGG G77CGCGGAC TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGC7GCA 
40 52 981 AGGCC77CGC GGAAGCGGC7 GACGGCACCG G7T7CGCCGA GGGGTCCGGC G7GC7GATCG 
5 304 1 7CGAGAAGCT C7CCGACGCC GAGCGC.-_-.CG GCCACCGCG7 GC7GGCGG7C GTCCGGGGTT 
53101 CCGCCGTCAA CCAGGACGG7 GCCTCCAACG GGC7G7CCGC GCCGAACGGG CCG7CGCAGG 
53161 AGCGGG7GAT CCGGCAGGCC CTGGCCAACG CCGGACTCAC CCCGGCGGAC G7GGACGCCG 
5 3221 7CGAGGCCCA CGGCACCGGC ACCAGGC7GG GCGACCCCAT CGAGGCACAG GCCG7GC7GG 
45 5 3281 CCACC7ACGG GCAGGGGCGC GACACCCC7G 7GCTGC7GGG C7CGC7GAAG TCCAACA7CG 
5334 1 GCCACACCCA GGCCGCGGCG GGCG7CGCCG G7G7CATCAA GA7GGTCC7C GCCATGCGGC 
5 3401 ACGGCACCC7 GCCCCGCACC C7GCACG7GG ACACGCCGTC CTCGCACG7C GAC7GGACGG 
5 3461 CCGGCGCCGT CGAAC7CC7C ACCGACGCCC GGCCCTGGCC CGAAACCGAC CGCCCACGGC 
5 3 521 GCGCCGGTGT CTCC7CCTTC GGCGTCAGCG GCACCAACGC CCACATCA7C C7CGAAAGCC 
50 53581 ACCCCCGACC GGCCCCCGAA CCCGCCZZGG CACCCGACAC CGGACCGC7G CCGC7GC7GC 
5364 1 7CTCGGCCCG CACCCCGCAG GCACTC3ACG CACAGGTACA CCGCC7GCGC GCG77CC7CG 
53701 ACGACAACCC CGGCGCGGAC CGGGTCGGCG TCGCGCAGAC AC7CGCCCGG CGCACCCAG7 
5 3761 7CGAGCACCG CGCCGTGCTG CTCGGCGACA CGC7CA7CAC CGTGAGCCCG AACGCCGGCC 
53821 GCGGACCGG7 GG7C77CG7C 7AC7CGGGGC AAAGCACGC7 GCACCCGCAC ACCGGGCGGC 
55 5 3881 AAC7CGCG7C CACCTACCCC GTG77CGCCG AAGCG7GGCG CGAGGCCC7C GACCACC7CG 
5394 1 ACCCCACCCA GGGCCCGGCC ACGCAC77CG CCCACCAGAC CGCGCTCACC GCGC7CC7GC 
5 4 001 GG7CC7GGGG CA7CACCCCG CACGCC-G7CA 7CGGCCAC7C CC7CGG7GAG ATCA.CCGCCG 
_ 0 6 1 _ j>wACuCCuC _-_»oTGTCC7G 7CCC7C-.3GG AZGCGGGZGZ GCTCC7CACC ACCCGCACCC 
54 121 GCC7GA7GGA CCAAC7GCCG TCGGGCGGZZ CG AT GGTC AC CGTCCTGACC AGCGAGGAAA 
60 5 4 181 AGGCACGCCA GGTGCTGCGG CCGGCCG7GG AGATCGCCGC CCTCAACGGC CCCCACTCCC 
^24 1 7CG7GC7G7C CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATrc-"
^f 01 ^ZZGCCTZZC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC G"CG^— ^ 

• 11^ I::I:^:H T CGCC 5 GGACC ctgacgtacc accagcccca caccgccatc zzzggcgacc 

-t m ^ _ — . s_ „ G _ CGAA:AC7GG GCGCACCAGC 7CC3CGACCA AG7ACGTT7T C- oGCG" " 

3 :^ 31 - ^AGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC C AAC GAG G A ~"7CCCC3*~ 
1 ^* - 37CGACGG CGTTGCCGCC CAGACCGGTA CGCGCGACGA GG7GCGGGCG Z ~GCAC£C~G 
~4o01 ZZZTCGCGZA GGTCCACGTC CGCGGCGTCG CGATCGACTG GACGCTCGTC ~~ C3GCGGGG 
3so6i "--CGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC 
3-1721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAG C~GCTGC7~G 
IU ^731 GCGCCGCG37 CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC C~GTCGC~GG 
34841 CC7CCCA7CC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG Z~CGGCGCGG 
5 4 901 CC7TCC7CGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG C7GCACGAAC 
5 4 961 TC37CA7CGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TG7GGCGGTC "CCGTCGAGA 
5 5021 7CGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT 
5 5081 CGGGCCTG7G GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGGCACGG 
5 5141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACG7C' r 
5 5201 ACGACCGG7T CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGGCCG 
5 5261 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG 
5 5321 ACGCCGCCZG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC 
= 5381 TGGCCGCGC7 CGACGCACCC (ZGCGGGGCGG CCCGACTGCC GTTCTCGTTC CAGGACG7^° 
5 5441 GCATCCACGG GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA 
5 5501 GCACCG7CCG CATGACCGGC CCGGACGGGC AGCTGG7GGC CGTGGTCGG7 GCCGTGC7GT 
5 5561 CGGGCCCG7A CGCGGAAGGC TCCGGTGACG GCCTGC7GCG CCCGGTCTGG ACCGAGCTGC 
55621 GGATGCCCG7 CCCGTZCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG 
-5 5 5 681 ACGGCGACG7 TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC 
5 574 1 GCCACCTG7C CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG 
5 5801 C7GCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC 
5 58 61 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG 
55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGG7C CGGATGTCCG 
30 55981 ACGCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT 
5604 1 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG 
56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG 
5 6161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG 
56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG 
35 56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC CCCGACGGC7 
5634 1 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCC7GG 
5 6401 7CGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGG7G 
564 61 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCZA 
56521 G7ACCGGCAA GCAGCACGTC CTGCGCGCCC CCGGGCTGCC CGACACGCAC ATCGCCGAC7 
40 5 6581 C7CGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGC7GA 
5664 1 CCGGCGAG7T CATCGACGCG TCGCTCGACC 7GCTGGACGC CGACGGCCGG 77CGTCGAGA 
5 6701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC 
5 6761 TGCTGGAGGC GGGCGCCGAC CGCATCGGCG AGATCCTGGG CGAACTGCTC CGGCTGTTCG 
5 6821 ACGCGGGCGC GCTGGAGCCG CTGCCGGTCC G7GCCTGGGA CGTCCGGCAG GCACGCGACG 
45 5 6881 CGCTCGGC7G GATGAGCCGC GCCCGCCACA TCGGCAAGAA CGTCCTGACG CTGCCCCGGC 
5694 1 CGC7CGACCC GGAGGGCGCC GTCGTCCTCA CCGGCGGCTC CGGCACGC7C GCCGGCATCC 
5 7 001 TCGCCCGCCA CCTGCGCGAA CGGCATGTCT ACCTGCTGTC CCGGACGGCA CCGCCCGAGG 
57 061 GGACGCCCGG CGTCCACCTG CCCTGCGACG TCGGTGACCG GGACCAGCTG GCGGCGGCCC 
57121 7GGAGCGGGT GGACCGGCCG ATCACCGCCG TGGTGCACCT CGCCGGTGCG CTGGACGACG 
50 5 7161 GCACCGTCGC GTCGCTCACC CCCGAGCGTT TCGACACGGT GCTGCGCCCG AAGGCCGACG 
5724 1 GCGCCTGG7A CCTGCACGAG CTGACGAAGG AGCAGGACCT CGCCGCGTTG GTGCTCTAC7 
57 301 CGTCGGCCGC CGGCGTGCTC GGCAACGCCG GCCAGGGCAA CTACGTCGCC GCGAACGCG7 
57 361 TCC7CGACGC GCTCGCCGAG CTGCGCCACG GTTCCGGGCT GCCGGCCC7C 7CCA7CCCC7 
574 21 GGGGGCTC7G GGAGGACGTG AGCGGGCTCA CCGCGGCGCT CGGCGAAGCC GACCGGGACC 
55 5 7481 GGA7GGGGCG CAGCGGTTTC CGGGCCATCA CCGCGCAACA GGGCATGCAC CTGTACGAGG 
5 7 54Z CGGZCGGCCG CACCGGAAGT CCCGTGGTGG TCGCGGCGGC GC7CGACGAG GCGCCGGACG 
5^602 7GCCGC7GC7 GCGCGGCCTG CGGCGGACGA CCGTCCGGCG GGCCGCCGTC GGGGAG7G77 
i766I CG7CCGCCGA CCGGC7CGCC GCGCTGACCG GCGACGAGCT CGCCGAAGCG CTGCTGACGC 
57721 TCGT'CCGGGA GAGCACCGCC GCCGTGCTCG GCCAGGTGGG TGGCGAGGAC ATCCCCGCGA 
.60 5 7781 CGGCGGCG77 CAAGGACCTC GGCA7CGACT CGC7CACCGC GG7CCAGCTG GGCAACGCCG 
573 4 1 TCACCGAGGG GACCGGTGTG CGGCTGAACG CCACGGCGG7 C7TCGAC77C CCGACCC*~Z
5 7 901 ACGTGCTCGC CGGGAAGCTC GGCGACGAAC TGACCGGCAC CCGCGCGCCC GTCGTGCC^C 
57961 GGACCGCGGC CACGGCCGGT GCGCACGACG AGCCGCTGGC GATCGTGGGA ATGGCCTGCC 
58C2i GGCTGCGCG 3 ZGGGG7CGCG 7CACCCGAGG AGC7C7GGCA CCTCGTGGCA TCCGGCA-" 
5 53081 ACGCCA7CAZ GGAGTTCCCG ACGGACCGCG GCTGGGACGT GGACGGGATC TACGACCCGG 
5314 1 ACCCCGACGZ GATCGGCAAG ACCTTCGTCC GGCACGGTGG CTTCCTCACC GGCGCGACA- 
58 201 GCTTCGA.CGC GGCG77C7TC GGCATCAGCC CGCGCGAGGC CCTCGCGATG GACCCGCAGC 
582 61 AGCGGG7GC7 CCTGGAGACG TCGTGGGAGG CG T TCG AAA G CGCCGGCATC ACCCCGGAC~ 
5 3 321 GGAGCCGCGG CAGCGACACC GGCGTGTTCG TCGGCGCCTT CTCC7ACGGT 7ACGGCACC 3 
10 5 3 331 GTGCGGACAC CGACGGCTTC GGCGCGACCG GCTCGCAGAC CAGTGTGCTC TCCGGCCGGC 
5844 1 TGTCG7ACT7 CTACGGTCTG GAGGGTCCGG CGGTCACGGT CGACACGGCG TGTTCG7CG7 
58 501 CGCTGGTGGC GCTGCACCAG GCCGGGCAGT CGCTGCGC7C CGGCGAATGC 7CGC7CGCCC 
58 561 TGG7CGGCGG CGTCACGG7G ATGGCGTCTC CCGGCGGCTT CGTGGAGTTC TCCCGGCAGC 
58621 GCGGCCTCGC GCCGGACGGC CGGGCGAAGG CGTTCGGCGC GGGTGCGGAC GGCACGAGC7 
15 58 681 TCGCCGAGGG 7GCCGGTGTG CTGATCGTCG AGAGGCTCTC CGACGCCGAA CGCAACGG7C 
5874 1 ACACCG7CCT GGCGGTCGTC CGTGGTTCGG CGGTCAACCA GGATGGTGCC TCCAACGGGC 
58 801 TGTCGGCGCC GAACGGGCCG TCGCAGGAGC GGGTGATCCG GCAGGCCCTG GCCAACGCCG 
588 61 GGCTCACCCC GGCGGACGTG GACGCCGTCG AGGCCCACGG CACCGGCACC AGGC7GGGCG 
58 921 ACCCCATCGA GGCACAGGCG GTACTGGCCA CCTACGGACA GGAGCGCGCC ACCCCCC7GC 
20 58 981 TGCTCGGC7C GCTGAAGTCC AACATCGGCC ACGCCCAGGC CGCGTCCGGC GTCGCCGGCA 
5904- TCATCAAGA7 GGTGCAGGCC CTCCGGCACG GGGAGCTGCC GCCGACGCTG CACGCCGACG 
59101 AGCCG7CGCC GCACG7CGAC TGGACCGCCG GCGCCG7CGA ACTGC7GACG TCGGCCCGGC 
59161 CGTGGCCCGA GACCGACCGG CCACGGCGTG CCGCCGTCTC C7CG7TCGGG GTGAGCGGCA 
5 92.21 CCAACGCCCA CGTCA7CC7G GAGGCCGGAC CGGTAACGGA GACGCCCGCG GCATCGCCT7 
25 5 9281 CCGGTGACC7 TCCCCTGCTG GTGTCGGCAC GCTCACCGGA AGCGCTCGAC GAGCAGATCC 
5 9341 GCCGACTGCG CGCC7ACCTG GACACCACCC CGGACGTCGA CCGGGTGGCC GTGGCACAGA 
594 01 CGCTGGCCCG GCGCACACAC TTCGCCCACC GCGCCGTGCT GCTCGGTGAC ACCCTCATCA 
59461 CCACACCCCC CGCGGACCGG CCCGACGAAC TCGTCTTCGT CTAC7CCGGC CAGGGCACCC 
59521 AGCA7CCCGC GATGGGCGAG CAGCTCGCCG CCCCCCATCC CGTG7TCGCC GACGCC7GGC ** 
30 5 9581 ATGAAGCGCT CCGCCGCCTT GACAACCCCG ACCCCCACGA CCCCACGCAC AGCCAGCATG * 
5964 1 TGCTCTTCGC CCACCAGGCG GCGTTCACCG CCCTCCTGCG GTCCTGGGGC ATCACCCCGC t 
59701 ACGCGGTCAT CGGCCACTCG CTGGGCGAGA TCACCGCGGC GCACGCCGCC GGCATCCTGT > 
59761 CGCTGGACGA CGCG7GCACC CTGATCACCA CGCGCGCCCG CCTCATGCAC ACGCTCCCGC 
59821 CACCCGGTGC CATGGTCACC GTACTGACCA GCGAAGAGAA GGCACGCCAG GCG77GCGGC 
35 5 9881 CGGGCG7GGA GATCGCCGCC GTCAACGGGC CCCACTCCAT CGTGC7GTCC GGGGACGAGG 

5 9941 ACGCCGTGC7 CACCGTCGCC GGGCAGCTCG GCATCCACCA CCGCC7GCCC GCCCCGCACG ■* 
60001 CCGGGCAC7C CGCGCACATG GAGCCCGTGG CCGCCGAGC7 GCTCGCCACC ACCGGCGGGC 
60061 TCCGCTACCA CCC7CCCCAC ACCTCCAT7C CGAACGACCC CACCACCGC7 GAG7AC7GGG 
60121 CCGAGCAGG7 CCGCAAGCCC GTGCTGT7CC ACGCCCAGGC GCAGCAGTAC CCGGACGCCG 
40 60161 TCTTCG7GGA GA7CGGCCCC GCCCAGGACC 7C7CCCC3C7 CGTCGACGGG A7CCCGC7GC 
602 4 1 AGAACGGCAC CGCGGACGAG G7GCACGCGC 7GCACACCGC GCTCGCGCAC CTCTACGCGC 
60301 GCGG7GCCAC GC7CGACTGG <CCCCGC AT GC 7CGGGGC7GG G7CACGGCAC GACGCGGA7G 
60361 TCCCCGCGTA CGCG7TCCAA CGGCGGCACT AC7GGATCGA G7CGGCACGC CCGGCCGCA7 
604 21 CCGACGCGGG CCACCCCGTG CTGGGCTCCG GTATCGCCCT CGCCGGGTCG CCGGGCCGGG 
45 004 8 1 TGTTCACGGG T7CCG7GCCG ACCGG7GCGG ACCGCGCGG7 G77CGTCGCC GAGC7GGCGC 
6054 1 TGGCCGCCGC GGACGCGGTC GACTGCGCCA CGG7CGAGCG GCTCGACA7C GCCTCCG7GC 
60601 CCGGCCGGCC GGGCCATGCC CGGACGACCG 7ACAGACC7G GGTCGACGAG CCGGCGGACG 
60661 AGGGCCGGCG GCGG7TCACC G7GCACACCC GCACCGGCGA CGCCCCGTGG ACGCTGCACG 
6C7 21 CCGAGGGGG7 GC7GCGCCCC CA7GGCACGG CCC7GCCCGA TGCGGCCGAC GCCGAG7GGC 
50 60731 CCCCACCGGG GGCGGTGCCC GCGGACGGGC 7GCCGGG7GT GTGGCGCCGG GGGGACGAGG 
6084 1 7C7TCGCCGA GGCCGAGG7G GACGGACCGG ACGGT7TCGT GGTGCACCCC GACCTGC7CG 
■ 60901 ACGCGG7CT7 CTGCGCGGTC GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACC7GA 
60961 CGGTGCACGC G7CGGACGCC ACCG7ACTGC GCGCC7GCC7 CACCCGGCGC ACCGACGGAG 
61021 CCATGGGA77 CGCCGCCTTC GACGGCGCCG GCCTGCCGG7 ACTCACCGCG GAGGCGG7GA 
55 61081 CGCTGCGGGA GGTGGCG7CA CCG7CCGGCT CCGAGGAGTC GGACGGCC7G CACCGG77GG 
6114 1 AG7GGC7CGC GGTCGCCG AG GCGG7CTACG ACGG7GACC7 GCCCGAGGGA CATGTCC7GA 
61201 TCACCGCCGC CCAwCCGAC GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA 
61261 CCCGCG7CC7 GACC 3 GCC7G CAACACCACC TCACCACCAC C GAG 3 AC AC C C7CATCG7C3 
61321 ACACCACCAC CGACCCCGCC GGCGCCACCG TCACCGGCC7 CACCCGCACC GCCCAGAACG . 
60 613S1 AACACCCCCA CCGCA7CCGC C7CA7CGAAA CCGACCACCC CCACACCCCC C7CCCCC7GG 
CCCAACTCGC
CCCACCTCAC 
ACGCCATCAT 
ACCACCCCCA 
ACCTCCCCTC 
AACCCCTCAC 
TCACCCCCGA 
ACCACCTCAC 
TCCTCGGCAG 
CCACCCACCG 
CCACCAGCAC 
GTTTCCTCCC 
GCGAGGACTT 
CGCCCATCCT 
TCGCCCAGCG 
TCTCGGACGC 
CGACGTTCAA 
CGGAGGCGAC 
TCCTCGCCGC 
CGGCACGGAC 
GCGGGGTCGC 
CCGAGTTCCC 
CCCCCGGCAA 
CCGCGTTCTf 
TCCTCGAAAC 
GCAGCGACAC 
TGGGCGGGTT 
TCTTCGGCAT 
CCCTGCACCA 
GTGTCACGGT 
CCCCCGACGG 
GCGCCGGCGT 
TCGCGGTCGT 
CCAACGGCCC 
CCGCCGACGT 
AGGCACAGGC 
CGGTCAAGTC 
TGGTCATGGC 
CGCATGTGGA 
ACGCGGGACG 
ACGTGATCCT 
TGCCGTTGCC 
AGGGGTATCT 
GTGCTGTCTT 
TGGATCAGCC 
GTGTGGAGTT 
CGTTGTTGCC 
AGCGGGTGGA 
GGCAGGCCCA 
CGGCGTGCGT 
GCCAGGTCAT 
CCGGTGAGGT 
CAGTCGTGGC 
GCGTGCGAGT 
TCGAGGACGA 
GGTGGTCGAC 
GGAACCTGCG 
TCGTGGAGTG 
CGTCGTTGCG 
GGACCCTGGG 



CACCCTCGAC 
CCCCCTCCAC 
CATCACCGGC 
CACCTACCTC 
CGAGGTCGGC 
CGCCATCTTC 
CCGCCTCACC 
CCAAAACCAA 
CCCCGGACAA 
CCACACCCTC 
CCTCACCGGA 
GATCACGGAC 
CGTCATGGCC 
GAGCGGCCTG 
GCTCGCCGAG 
CACGGCCGCC 
GGACCTCGGC 
CGGGCTGCGG 
CAAGCTCCGC 
CCACCACGAC 
CTCGCCGGAG 
CACCGACCGC 
GACCTACGTC 
CGGCATCAGC 
CTCCTGGGAG 
CGGCGTGTTC 
CGGCGCCACC 
GGAGGGCCCG 
GGCGGCACAG 
GATGCCCACC 
CCGTTGCCAG 
TCTTGTGCTG 
CCGCTCCTCC 
CTCCCAGCAG 
GGACGTGGTG 
CATCATCGCG 
GAACATCGGA 
GATGCGCCAC 
CTGGACCGAG 
CCCGCGCCGC 
TGAGGGTGTT 
GGTGTCGGCT 
GCGCGGGAGT 
CGGTCACCGT 
GCGTACGGTG 
GATGGACCGT 
GCACACGGGC 
GGTGGTCCAG 
CGGGGTCGTA 
GGCCGGGGCC 
CGCGGCGCGA 
CGGTCTGGTC 
CGGCGAGCCG 
GCGTCGTATC 
ACTCGCTGAG 
CGTGGACAGC 
TCGCCCCGTC 
CAGCGCCCAT 
CACCGGTGAC 
CGCGGCAGTG 



CACCCCCACC 
ACCACCACCC 
GGCTCCGGCA 
CTCTCCCGCA 
GACCCCCACC 
CACACCGZCG 
ACCG7CC7CC 
CCCCTCACCC 
GGAAACTACG 
GGCCAACCCG 
CAACTCGACG 
GACGAGGGCA 
GCCGCGATGG 
CGCAGGAGCG 
CTGCCCGACG 
GTGCTCGGCC 
ATCGACTCGC 
CTGAGTGCCA 
ACCGATCTGT 
GAGCCACTCG 
GACCTGTGGC 
GGCTGGGACA 
CGGCACGGCG 
CCGCGCGAGG 
GCGTTCGAGA 
ATGGGCGCGT 
GCCACGCAGA 
GCCGTCACCG 
GCGCTGCGGA 
CCGCTGGGCT 
GCCTTCGCGG 
GAGCGGCTCT 
GCCGTCAACC 
CGCGTCATCC 
GAGGCCCACG 
ACCTACGGCC 
CACACCCAGA 
GGCATCGCGC 
GGTGCGGTGG 
GCGGGCGTGT 
CCCGGGCCGT 
CGGAG7GAGG 
GTGGATGTGG 
GCGGTACTGC 
TTCGTCTTTC 
TCTGCGGTGT 
TGGGATGTGC 
CCGGCCAGCT 
CCCGACGCGG 
CTCAGCCTTG 
CTGGCCGGGC 
GAGGGCGTGT 
TCGGCGGTGG 
GCCGTCGACT 
GTACTGAAGG 
GCCTGGGTGA 
GCGCTGGACG 
GGGGTG<~T GG 
GGCGGCTGGG 
GACTGGGACA 



TCCGCCTCAC 
CACCCACCAC 
CCCTCGCCGG 
CC7CACCCCC 
AACTCGCCAC 
CCACCCTCGA 
ACCCCAAAGC 
ACTTCGTCCT 
CCGCCGCCAA 
CCACCTCCAT 
ACGCCGACCG 
TGCGCCTCTA 
ACCCGGCACA 
CGCGGCGCGT 
CCGACCGCGG 
ACGCCGACGC 
TCACCGCGAT 
CGCTGGTGTT 
TCGGCACGGC 
CGATCGTCGG 
AGCTCGTGGC 
TCGACCGGCT 
GCTTCCTCGC 
CACGGGCCAT 
ACGCGGGCAT 
TCTCCCATGG 
ACAGCGTGCT 
TCGACACCGC 
CTGGAGAATG 
ACGTCGAGTT 
AAGGCGCCGA 
CCGACGCCGA 
AGGACGGCGC 
GCCAGGCCCT 
GCACCGGAAC 
AGGACCGCGA 
CCACCGCCGG 
CGAAGACACT 
AACTGCTCAC 
CGTCGCTCGG 
CGCGTGTGGA 
CGAGTCTGCG 
CCGCGGTCGC 
TGGG7GATGC 
CCGGGCAGGG 
TCGCGGCTCG 
GGGAGATGTT 
GGGCGGTCGC 
TGATCGGACA 
AGGACGCCGC 
GGGGAGCGAT 
GGATCGCGGC 
AGGACGTGGT 
ACGCCTCCCA 
GAGTTGCAGG 
CCGAGCCGGT 
CGGCGGTGGC 
TGCCGGCG AT 
AGCGATGGCT 
CGGTGGTCGA 



CCACCACACC 
CACCCCCCTC 
CATCCTCGCC 
CGACGCCACC 
CACCCTCACC 
CGACGGCATC 
CAACGCCGCC 
CTACTCCAGC 
CGCCTTCCTC 
CGCCTGGGGC 
GGACCGCATC 
CGAGGCGGCC 
GCCGATGACC 
CGCCCGTGCC 
GGCGGCGGTG 
CTCCGAGATC 
CGAGCTGCGC 
CGACCACCCG 
CGTGCCCACG 
CATGGCGTGC 
GTCCGGCACC 
GTTCGACCCG 
CGAGGCCGCC 
GGACCCGCAG 
CGTGCCGGAC 
GTACGGCGGC 
CTCCGGCCGG 
CTGCTCGTCG 
CTCGCTGGCG 
CTGCCGCCAG 
CGGCACGAGC 
GCGCAACGGA 
CTCCAACGGC 
CGACAAGGCC 
CCCGCTGGGC 
CACACCGCTC 
TGTCGCCGGC 
GCACGTGGAC 
CGAGGCGAGG 
TATCAGCGGT 
GCCGTCTGTT 
GGGGCAGGTG 
GCAGGGGTTG 
CCGGGTGATG 
TGCTCAGTGG 
TATGGAGGAG 
GGCGGGGCCG 
GGTCAGCCTG 
CTCCCAGGGC 
CCGCGTGGTG 
GGCTTCGGTG 
GCGTAACGGC 
GACGCGGTAT 
CACGCCCCAC 
GAAGGCCGCG 
GGATGAGAGT 
GGAGCTGGAC 
GGAACAGGCC 
GACGGCGTTG 
ACCGGTGCCA 



CTCCACCACC 
AACCCCGAAC 
CGCCACCTGA 
GCGGGGAGCC 
CACATCCCCC 
CTCCACGCCC 
TGGCACCTGC 
GCCGCCGCCG 
GAC3CCC7CG 
ATGTGGCACA 
CGCCGCGGCG 
GTCGGCTCCG 
GGCTCCGTAC 
GGGCAGACGT 
ACCACCCTCG 
GCGCCGACCA 
AACCGGCTCG 
ACACCTCGGG 
CCCGCGCGGA 
CGACTGCCCG 
GACGCGATCA 
GACCCGGACG 
GGCTTCGATG 
CAGCGCGTCA 
ACGCTGCGCG 
GGCG7CGACC 
TTGTCGTACT 
TCGCTGGTCG 
GTCGCCGGCG 
CGGGGACTCG 
TTCTCGGAGG 
CACACCGTCC 
ATCTCCGCAC 
GGGCTCGCCC 
GACCCGATCG 
TACCTCGGTT 
GTCATCAAGA 
GAGCCGTCGT 
CCG7GGCCCG 
ACGAACGCCC 
GACGGGTTGG 
GAGCGGC7GG 
GTGCGTGAGC 
GG7GTGGCGG 
G7GGGCA7GG 
TGTGCGCGGG 
GATG7GGCGG 
GCCGCACTGT 
GAGATCGCGG 
GCC7TGCGCA 
GCAT7GCCGG 
CCCGCCTCGA 
GAGACCGAAG 
G7GGAAGCCA 
TCGG7GGCGT 
TAC7GGTACC 
GCG7CCGTCT 
C A w.-i-— oGTGG 
GCGCAGGCG7 
GGGCGGCTGC 
65041 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGC^
6 5101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCC-— ACGG 
65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC 
; 5 5 221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGC7 GCCGGGCACG GCG — ""G~GG 
3 6 5 2 31 AGCTGGTCAT CCGGGCCGGT GACGAGACCG GTTGCGGGAT AGTGGATGAA CTGG7CATCG 
6 5 2'ii AATCCCCCC7 CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGA ^GGAG 
6 5 401 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCG3CAGCT 
654 61 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACG2TTCCG 
65521 GTGTTG "^^G TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCC^~CGACA 
10 65 581 CCTCGGAGTT CTACTTGCGC CTGGACGCGG TGGGCTACCG GTTCGGACCC A7G77CCGCG 
6564 1 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCG7GTACGC CGAGGTCGCG CTGC2GGAGG 
6 5 701 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC 
657 61 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCGTTC T 
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC 
15 6 5881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA 
6594 1 TGGACGCGCT CGTGAGCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC 
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC 
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC T CGGGGAG AC CCGGGACCTG ACCACCCGTG 
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC CGGTGATCTT CCAGGTGACC GGTGGCCTCG 
66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT 
6624 1 7CCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG 
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGC7GA7GC 
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCG7CCGCCA 
664 21 CCGGTTCCCT CGACGACCTT GCCG7CG7CC CCACCGACGC CCCGGACCGG CCGCTCGCGG 
25 664 81 CCGGCGAGG7 GCGGA7GGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG 
6 6541 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCG7CCTGG 
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG 
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT 
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG 
30 6 6781 7CGACC7GGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG 
6684 1 7CGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA 
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA 
6 6 961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT 
67021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA 
35 6 7 081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA 
6714 1 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC 
67 201 CGGTCCACGC CTGGGACGTG CGGCAGGCCC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC 
672 61 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG 
67 321- TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC 
40 67381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT 
674 4 1 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA 
67 501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG 
67 561 ACCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA 

67 621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA 
45 6 7 631 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC 

6774 1 GCCGTGGGCA AGGGCTGCGC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGTCAGCG 
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC 
6 78 61 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG 
67921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG- CGCCGTCGCG CCGTTGCTCC 
50 6 7 981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG 
68041 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCA7GCAGG 
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG 

68 161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC 
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACGCGACGG 

55 6S281 CGGAGGCGCT CACCGCCCAC CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG 
6834 1 GGGAGTCCCT GCCCGCGGTG ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC 
68401 CGATCGCCAT CGTGGCGATG GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC 
684 61 TGTGGCGGCT CGTCGAGTCC GGCACCGACG CGATCACCAC GCCTCCTGAC GAC--GCGGCT 
68 521 GGGACGTCGA CGCGCTGTAC GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC 

60 63 581 GGGGCGGTTA CCTGGCCGGG GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC 
b3 641 GCGAAGCGCT CGGCATGGAC CCGCAGCAAC GCCTGCTGCT
68 701 TCGAGCGCGG CCGGATCAGT CCGGCGTCGC TCCGCGGCCG 
637 61 GTGCGGCCGC GCAGGGCTAC GGGCTGGGCG CCGAGGACAC 
63S:i GTGGTTCCAC GAGCC7GC7G TCCGGACGGC TCGCG7ACG7 
68881 CGGTCACCGT GGACACGGCG TGCTCGTCGT CTCTGGTCGC 
68 94 1 GGCTGCGCCT GGGCGAGTGC GAACTCGCTC TGGCCGGAGG 
6 9001 CGGCCGCGTT CGTGGAGTTC TCCCGCCAGC GCGGGCTCGC 
6 9061 CGTTCGGCGC GGGCGCGGAC GGCACGACGT GGTCCGAGGG 
69121 AACGCCTC7C CGACGCCGAG CGGCTCGGGC ACACCGTGCT 
69181 CCGTCACGTC CGACGGCGCC TCCAACGGCC TCACCGCGCC 
6924 1 GGGTCATCCG GAAGGCGCTC GCCGCGGCCG GGCTGACCGG 
69301 AGGGGCACGG CACCGGCACC 'CGGCTCGGCG ACCCGGTCGA 
693 61 CGTACGGGCA GGACCGTCCG GCACCGGTCT GGC7GGGC7C 
6 9421 ATGCCACGGC CGCGGCCGGT GTCGCGGGCG TCATCAAGAT 
69481 GCACGATGCC GCGGACGCTG CATGTGGAGG AGCCC7CGCC 
6954 1 GACAGGTGTC CCTGCTCGGC TCCAACCGGC CCTGGCCGGA 
69601 CGGCCGTCTC CGCGTTCGGG CTCAGCGGGA CGAACGCGCA 
69661 GTCCGGCGCC CGTGGCGTCC CAGCCGCCCC GGCCGCCCCG 
69721 CGTGGGTGCT CTCCGCGCGG ACTCCGGCCG CGCTGCGGGC 
69781 ACCACCTCGC GGCGGCACCG GACGCGGATC CGTTGGACAT 
69841 GCCGCGCCCA GTTCGCCCAC CGTGCCGCGG TCGTCGCCAC 
6 9901 CCGCGCTCGA CGGCCTCGCG GACGGCGCGG AGGCGCCCGG 

6 9961 AGGAGCGGCG CGTCGCCTTC CTCTTCGACG GCCAGGGCGC 
70021 GCGAGCTCCA CCGCCGGTTC CCCGTCTTCG CCGCCGCG7G 

7 0081 TCGGCAAGCA CCTCAAGCAC TCCCCCACGG ACGTCTACCA 
7014 1 CCCATGACAC CCTGTACGCC CAGGCCGGCC TGTTCACGCT 
7 0201 TGCTGGAGCA CTGGGGGGTG CGGCCGGACG TGCTCGTCGG 
7 0261 CCGCGGCGTA CGCGGCGGGG GTGCTCACCC TGGCGGACGC 
7 0 321 GGGGGCGGGC GCTGCGGGCG CTGCCGCCCG GGGCGATGCT 
7 0381 CGGAGGTCGG CGCCCGCACG GATCTGGACA TCGCCGCGGT 
704 4 1 TGCTCGCCGG TTCGCCGGAC GATGTGGCGG CGTTCGAACG 
7 0501 GGCGCACGAA ACGGCTCGAC GTCGGGCACG CGTTCCACTC 
7C561 7CGACGCC77 CCGTACGGTG CTGGAGTCGC TCGCGTTCGG 
7 0621 TGTCCACGAC GACGGGCCGG GACGCCGCGG ACGACCTCAT 
70681 GCCATGCGCG TCGGCCGGTG CTGTTCTCGG ATGCCGTCCG 
7074 1 TCACCACGTT CGTGGCCGTC GGCCCCTCCG GCTCCCTGGC 
70801 CCGGGGAGGA CGCCGGGACC T AC CACGCGG TGCTGCGCGC 
7 0861 CGGCGCTGAC CGCCCTCGCC GAGCTGCACG CCCACGGCGT 
7 0 921 TACTGGCCGG TGGCCGGCCA GTGGACCTTC CCGTGTACGC 
7 0981 GGCTGGCCCC GGCCG7GGCG GGGGCGCCGG CCACCG7GGC 
71041 AG7CCGAGCC GGAGGACCTC AJGCGTCGCCG AGATCGTCCG 
71101 7CGGCG7CAC GGACCCCGCC GACGTCGATG CGGAAGCGAC 
71161 AC7CACTGGC GGTGCAGCGG CTGCGCAACC AGCTCGCC7C 
71221 CGGCGGCCG7 CCTG77CGAC CACGACACCC CGGCCGCGC7 
7 1281 GGATCGAGGC CGGCCAGGAC CGGATCGAGG CCGGCGAGGA 
7 1341 7C7CGCTCC7 GGAGGAGA7G GAG7CGC7CG ACGCCGCGGA 
7 1401 CGGAGCG7GC GGCCA7CGCC GA7C7GC7CG ACAAGC7CGC 
7 14 61 GA7GAGCACC GA7ACGCACG AGGGAACGCC GCCCGCCGGC 
7 1521 GGACGGTCAC CGCGCCA7CC 7GGAGAGCGG CACGG7GGG7 
7 1581 CAAGCAC7GG C7GG7CGCCG CCGCCGAGGA CG7CAAGC7G 
7 1641 CACCTCCGCC GCGCCG7CCG AGATGCTGCC CGACCGGCGG 
7 1701 GGAC7CACCG GAGCACAACC GC7ACCGCCA GAAGA7CGCG 
7 I 7 c i GGCCCGCAAG CGGGAGGAC7 7CG7CGCCCA GGCCGCCGAC 
7 1821 GGCCGCGGGA CCCGGCACCG ACC7CATCCC CGGG7ACGCC 
7 1881 CA7CAACGCG C7G7ACGGGC TCACCCCTGA GGAGGGGGCC 
71941 CG ACA7CACC GGC7CGGCCG A7C7GGACAG CG7CAAGACG 
7 2001 GCACGCGC7G CGGC7GG7CC GCGCGAAGCG 7GACGAGCGG 
7 2061 GC7GGCCTCG GCCGACGACG GCGAGA7C7C GC7CAGCGAC 
72121 CGCGACGC7G C7GT7CGCCG GCCACGAC7C GG7GCAGCAG 
72181 CGCAC7GC7C AGCCACCCCG AGCAGCAGGC GGCGC7GCGC 



CGAAACGGCG 
GGAGG7CGGC 
CGAGGGCCAC 
GC7CGGGC7G 
GCTGCATC7G 
GG7C7CCG7A 
GGCCGACGGG 
CG7GGGCG7G 
CGCCG7CG7C 
GAACGGGC7C 
CGCCGACG7G 
GGCGGACGCG 
GC7GAAG7CG 
GG7GCAGGCG 
CGCCG7CGAC 
CGACGAGCG7 
CG7CATCCTG 
TGAGGAGTCC 
CCAGGCGGCC 
CGGGTACGCG 
CACCCCGGAC 
AGTCGTCACC 
CCAGCGCGCC 
GGACGAGGTC 
CGGCGAACAC 
CGAAGTGGCG 
GCACTCCGTC 
GACGGAGTTG 
CGCCG7CGAC 
CAACGGCCCG 
GGAG7GGTCG 
CCGGCACGTC 
CGCGGCGCGG 
AACGCCCGCG 
GGAGCTGGCC 
GTCGGCCGCG 
CCGGACCGG7 
CCCGGTCGAC 
G77CCAGCAC 
GGACACCGGG 
7CGGCGCACC 
GTTCTTCGCG 
GGCAACCGGG 
CACCGCG77C 
CGACGACGCG 
CATCGCGGCG 
CCATACC7GG 
CGCTGCCCA7 
TCG77CGACC 
G7CACCAACG 
CCCGGCTGG7 
GGGGAC77CA 
GCC7GCC7GG 
AAGCGGC7GC 
G7GC7GGAGG 
C7GACCGACG 
GGCGAGGACC 
GACGAGGCGA 
A7GG7CGGC7 
GCGCGCCCGG 



7GGGAGGCGA 
GTC7A7G7CG 
GCGATCACCG 
j^uo'j C C G G c 
GCG7GCCAGG 
CTGAG77CGC 
CGC7GCAAG7 
CTCG7AC7GG 
CGCGGCAGCG 
7CGCAGCAGC 
GACG7CG7CG 
C7GC7CGCGA 
AACA7CGGAC 
A7CGGCGCGG 
7GGAGCACCG 
CCGCGCCGGG 
GAACAGCACC 
CAGCCGC7GC 
CGGCTGCGCG 
CTGGCCACCA 
GGATTCCG7G 
GGGACCGC7C 
GGAA7GGGGC 
TCCGACGCGT 
GGCGCTC7CG 
CTGC7GCGGC 
GGCGAGG7GA 
A7CGTGGCCC 
GGAAGCCCGG 
7CCGCCG7GG 
GCGGCCGGGC 
GACGG7GCGC 
C7GCCGG7GG 
CAC7GGC7GC 
GACCGCGGCG 
GCGGAGAGGG 
GAGGAGACCG 
CTGGCCGCGG 
CG7TCC7AC7 
GGTCCGGCGG 
GCGGCGC7GC 
CTCGG777CG 
CTGGACC7GC 
CTCCAGGACC 
CCCACCG7GC 
ACGCCGGCCC 
AAGGAC7ACC 
TCGCGATCCA 
TGT7CGGCGT 
ATCCGCGG77 
7C7CCGGGA7 
CAC7GCGCCC 
ACGACATCGA 
CC7CCC7CGT 
CACGGA7GCG 
ACT7C77CGG 
7GCTGCACCG 
CGGGC\j7'-;7T 
ACTGCCTCTA 
AGC7GGTCGA 
7 2241 CAACGCGGTC GAGGAGATGC TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCG^
7 2 301 CTGTGTCGAG GACGTCGATG TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC 
7 2 361 GCTCTACTCG ACGGCCAACC GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT 
724 21 GACGCGCCCG CTGGAGGGCA ACTTCGCGTT CGGCCACGGC ATTCACAAGT GT<~^ GGCC< 
5 724 81 GCACATCGCC CGGGTGCTCA TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA 
72 541 CGTCCGGCTG GCCGGCGACG TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA 
7 2 601 GCTGCGGGTC ACCTGGGGGG CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC 
7 2 661 GGGACGACGG TCGCGCACAT CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC 
72 721 ACCCAGCGCT GCTACCTGCG CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTrGAC 
10 7 2 781 GTCGGCGCGA ACATCGGCAT GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC 
728 41 GTGCACGCCT TCGAGCCCGC GCCCG7GCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG 
7 2 901 CACGGCATCC CGGGCCAGGC GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG 
7 2 961 ATGACCTTCT ATCCCGACGC CACGC7GATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG 
73021 ACGGAGCTGT TGCGCACGCT CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC 
15 7 3081 ATGCTCGCGC AACTGCCCGA CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC 
7 314 1 GACGTCATCG CGGAGCGCGG TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG 
7 3201 AGCGAACGGC AGGTCTTCGC CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC 
7 32 61 GTCGCGGAGG TCCA.CGACAT CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC 
7 3321 CATGGCTTCA CCGTGGTCGC CGAGCAGGAA CCGCTG7TCG CCGGCACGGG CATCCACCAG 
20 7 3381 GTCGCCGCGC GGCGGGTGGC CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG 
7 3441 GCCGCGGTGC GGACGGCGGC TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG 
7 3501 CCCTTCACCC CCAGCTTGCG GAACACGTTG GTGAGGTGCT GTTCCACCGT GC7GGAGGTG 
7 3561 ACGAACAGCT GGC7GGCGA7 C7CC77G77G G7GCGCCCGA CCGCGGCG7G CGACGCCACC 
73621 CGCCGC7CCG CC7CGG7CAG CGA7G7GA7C CGC7GCGCCG GCGTCACG7C C7GGG7GCCG 
25 7 3681 7CCGCG7CCG AGGAC7CCCC ACCGAGCCGC CGGAGGAGCG GCACGGC7CC GCAC7GGG7C 
7 3741 GCGAGG7GCC G7GCGCGGCG GAACAG7CCC CGCGCACGGC 7G7GCCGCCG GAGCA7GCCG 
7 3801 CACGC77CGC CCA7G7CGGC GAGGACGCGG GCCAGC7CG7 ACTGG7CGCG GCACATGATG 
7-38 61 AGCAGATCGG CGGCC7CG7C GAGCAG77CG A7CCGC77GG CCGGCGGAC7 G7AGGCCGCC 
7 3 921 TGCACCCGCA GCG7CA7CAC CCGCGCCCGG GACCCCA7CG GCCGGGACAG C7GC7CGGAG 
30 73981 ATGAGCC7CA GCCCC7CG7C ACGGCCGCGG CCGAGCAGCA GAAGCGCT7C GGCGGCG7CG 
74 04 1 ACCCGCCACA GGGCCAGGCC CGGCACGTCG ACGGACCAGC G7CGCA7CCG C7CCCCGCAG 
74101 7CCCGGAACG CG77G7ACGC CGCCCGG7AC CGCCCGGCCG CGAGA7GG7G 77GCCCACGG 
74 161 GCCCAGACCA TGTGCAG7CC GAAGAGGC7G 7CGGAGG7C7 CC7CCGGCAA CGGC7CGGCG 
7 4 221 AGCCACCGC7 CCGCCCGG7C CAGG7CGCCC AG7CGGA7CG CGGCGGCCAC GG7GC7GCTC 
35 7 4 281 AGCGGCAATG CGGCGGCCA7 CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG 
7 4 341 CCGCA77CGA CGGCGGCGGT CAGGTCGCCG CGGCGCAGCG CGGCC7CGGC GCGGAACCCC 
74 4 01 GCGTGGACCG CC7CG7CGGC CGGGG7CCGC ATG77G7CG7 CACCGGCCAG C77G7CGACC 
74 4 61 CAGGAC7GGA CGGCA7CGG7 G7CC7CGGCG 7AGAGCAGGG CCAGCAACGC CATCA7GG7C 
7 4 521 G7GG7CCGG7 CCG7CG7GAC CCGGGAG7GC TGGAGCACG7 AC7CGGC777 GGCC7CGGCC 
40 7 4 581 7G77CGGACC AGCCGCGCAG CGCG77GC7C AGGGCC77G7 CGGCGACGGC GCGG7GCCGG 
7 4 641 ACGGC7CCGG AAAACGAGGC GACC7CG7CC 7CGGCCGGCG GATCGGCCGG ACGCGGCGGA 
7 4 701 7CGGCCGCGC CGGGA7AGA7 CAGCGCGAGG GACAGG7CCG CGACGCGCAG G7GCGCCCGG 
74 7 61 CCC7GCTCGC 7CGGGGCGGC GGAGCGC7GG GCCGCCAGGA CC7CGGCGGC C7CGCCCGGC 
74 821 CGCCCGTCCA 7CGCCAGCCA GCAGGCGAGC GACACGGCG7 GC7CGCTGGA GAGGAGCCG7 
45 7 4 881 7CCCGCGACG CGG7GAGCAG CTCGGGCACA TGCCGGCCGG ATC7GGCGGG A7CGCAGAGC 
7 4 941 CGC7CGATGG CGGCGG7G7C GACGCGCAG7 GCGGCG7GGA CGGCGGGG7C G7CGGAGGCC 
7 5001 CGG7AGGCGA AC7CCAGG7A GG7GACGGCC 7CG7CGAGCT CGCCGCGCAG G7GGTGC7CG 
7 5061 CGCGCGGCG7 CGG7GAACAG CCCGGCGACC 7CGGCGCCG7 GCACCCGGCC GG7ACCCA7C 
7 5121 7GG7GGCGGG CGAGCACC77 GC7GGCCACG CCGCGG7CCC GCAGCAG77C CAGCGCCAGC 
50 7 5181 7CG7GCAGGC CACGCCGC7C GGCGGCGGAG AGG7CG7CGA G7ACGACGGA GCGGGCCGCG 
7 5241 GGG7GCGGGA ACCGCCC7TC CCGCAGCAGC CGCCCC7CGA CCAGC7G77C G7GGGCC7GC 
7 5301 TCGACCGCC7 CGGTG7CGAG GCCGG7CA7C CGC7GGACGA GGGTGAG77C GACACTCTCG 
7 5361 CCGAGCACGG CGGAAGC7CG GGCGACGC7C ACCGCGGCCG GGCCGCAACG A7AGAGCGAC 
7 5421 CCGAGG7AGG CGAGCCGG7A CGCCCGCCCC GCGACCACT7 CCAGGCACCC 7GAGG7CCGT 
55 7 5481 G7CCG7GCCT CCCGGATG7C G7CGATCAGG CCG7GGCCGA GGAGCAGG77 GCCGCCGGTC 
7 5 541 GCCCGGAACG CC7GGGCCAC CACG7CG7CG 7GCGCG7CC7 GGCCGAGG7G CCGGCGCACG 
7 5 601 AG7TCGG7GG 7C7GCGCC7C GG7GAGCGGG CGCAGCGCGA TC7CC7GG7A GTGGCGCAGA 
7 5 661 CTCAGCAG7G CCGCCCGGAA 7TGGGAG7GG GCGGGCGTCG GCCGGAGCAG C7CGGTCAGC 
7 5721 ACGA7GGCGA CACGGGCCCG GCTGA7GCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC 
75781 GGCGCGTCGG CG7GG7GCAC GTCG7CGA7G CCGATCAG7A CGGGCCGCTC CGCGGCGAGC 
75841 GTCAGCACCG TGCGGGTGAG TTCGGTCCCC AGGCGGTTGT CGACGTCGG" CC~~ ■> - , 

7 5 901 TCGCACGATG CCGTCAGCCG GACCAGCTCC GGTGTCCGGG CgJcSgC^C GG-T-^r 
75961 AGGAGCTGGC CGAGCATGCC GTACGGCAGG GCCCGCTCCT cStSaSa clr~4ol^ 
7602 1 AGGGTGACGA AGCCGGCCTT GGCCGCGGCG GCGTCGAGGA GTTCOTTCT? Gr~~lcG" 
7-?! !SS?£ CGG TGACGGCGGC GACGACGCCC CGCCCGCCCC CCGCTCGGGT GAGCG^CCGG 
7*141 TGGAGGGAAC CGAACTCGTC ATCGCGGGCG ATCAGGTCTG GGGGAGATAA GCGCGCTATC 
3"?: ^S G ^ TGGAA CTACCTCGCG ACCGTCGTGG AAACCCATAG GCATCACATG GCT-GWGM 
76^61 C^TACGCCT GTGATTCAGC CTGGCGGGAT GCTGTGCTAC AGATGGGAAG ATG~GATCTA 
in "oi GGGCCGTGCC CTTCCCTCAG GAGCCGACCG CCCCCGGCGC CACCCGCCGT ACCCCCTGGG 

10 7 Dj8 l CCACCAGCTC GGCGACCCGC TCCTGGTGGT CGACGAGGTA GAAGTGCCCG CCGGGGAAGA 
7 644 1 CCTCCACCGT GGTCGGCGCG GTCGTGTGCC CGGCCCAGGC GTGGGCCTGC TCACCGTC* 
7 6501 TCTTCGGATC GTCGTCACCG ATGCACACCG TGATCGGCGT CTCCAGCGGC GGCGCGGGc" 
'65 61 CCCACCGGTA CGTCTCCGCC GCGTAGTAGT CCGCCCGCAA CGGCGCCAGG AT r AGCGCGC 
76621 GCATTTCGTC GTCCGCCATC ACATCGGCGC TCGTCCCGCC GAGGCCGATG ACCGCCGCCA 
15 76681 GCAGCTCGTC GTCGGACGCG AGGTGGTCCT GGTCGGCGCG CGGCTGCGAC GGCG~CCGC~ 
7674 1 GGCCCGAGAC GATCAGGTGC GCCACCGGGA GCCGCTGGGC CAGCTCGAAC GCGAGTGTC- 
7 6801 CGCCCATGCT GTGGCCGAAC AGCACCAGCG GACGGTCCAG CCCCGGC^TC AACG~CTCg" 
7 6S61 CCACGAGGCC GGCGAGAACA CGCAGGTCGC GCACCGCCTC CTCGTCGCGG CGG~CCTGGC 
7 6921 GGCCGGGGTA CTGCACGGCG TACACGTCCG CCACCGGGGC GAGCGCACGG GCCAGCGGAA 
7 6981 GGTAGAACGT CGCCGATCCG CCGGCGTGGG GCAGCAGCAC CACCCGTACC GGGGCCTCGG 
77041 GCGTGGGGAA GAACTGCCGC AGCCAGAGTT CCGAGCTCAC CGCACCCCCT CGGCTGCGAC 
77101 CTGGGGAGCC CGGAACCGGG TGATCTCGGC CAAGTGCTTC TCCCGCATCT CCGGGTCGGT 
77161 CACGCCCCAT CCCTCCTCCG GCGCCAGACA GAGGACGCCG ACTTTGCCGT TGTGCACATT 
7 7221 GCGATGCACA TCGCGCACCG CCGACCCGAC GTCGTCGAGC GGGTAGGTCA CCGACAGCGT 
-5 77281 CGGGTGCACC ATCCCCTTGC AGATCAGGCG GTTCGCCTCC CACGCCTCAC GATAGTTCGC 
77 341 GAAGTGGGTA CCGATGATCC GCTTCACGGA CATCCACAGG TACCGATTGT CAAAGGCGTG 
774 01 CTCGTATCCC GAGGTTGACG CGCAGGTGAC GATCGTGCCA CCCCGACGTG TCACGTAGAC 
77 4 61 ACTCGCGCCG AACGTCGCGC GCCCCGGGTG CTCGAACACG ATGTCGGGAT CGTCACCGCC 
77 521 GGTCAGCTCC CGGATC 


Those of skill in the art will recognize that, due to the degenerate nature of the 
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be 
used to encode a given amino acid sequence of the invention. The native DNA sequence 
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to 
illustrate a preferred embodiment of the invention, and the present invention includes DNA 
compounds of any sequence that encode the amino acid sequences of the polypeptides and 
proteins of the invention. In similar fashion, a polypeptide can typically tolerate one or more 
amino acid substitutions, deletions, and insertions in its amino acid sequence without loss or 
significant loss of a desired activity. The present invention includes such polypeptides with 
alternate amino acid sequences, and the amino acid sequences shown merely illustrate 
preferred embodiments of the invention. 
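
The degeneracy point above can be illustrated with a minimal Python sketch. It assumes the standard genetic code; the two short DNA strings (`native` and `recoded`) are hypothetical examples chosen for illustration and are not taken from the FK-520 sequence.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (length divisible by 3) to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences: they differ at three codons yet encode the same peptide.
native = "ATGGCCGACCGG"
recoded = "ATGGCGGATAGA"

print(translate(native), translate(recoded))
```

Both strings translate to the same four-residue peptide although they differ in nucleotide sequence, which is the sense in which many DNA compounds can encode one amino acid sequence.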

The recombinant nucleic acids, proteins, and peptides of the invention are many and 
diverse. To facilitate an understanding of the invention and the diverse compounds and 
methods provided thereby, the following general description of the FK-520 PKS genes and 
modules of the PKS proteins encoded thereby is provided. This general description is 
followed by a more detailed description of the various domains and modules of the FK-520 
PKS contained in and encoded by the compounds of the invention. In this description, 
reference to a heterologous PKS refers to any PKS other than the FK-520 PKS. Unless 
otherwise indicated, reference to a PKS includes reference to a portion of a PKS. Moreover, 
reference to a domain, module, or PKS includes reference to the nucleic acids encoding the 
same and vice-versa, because the methods and reagents of the invention provide or enable 
one to prepare proteins and the nucleic acids that encode them. 

The FK-520 PKS is composed of three proteins encoded by three genes designated 
fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7 - 10 of the PKS. The fkbB 
ORF encodes the loading module (the CoA ligase) and extender modules 1 - 4 of the PKS. 
The fkbC ORF encodes extender modules 5 - 6 of the PKS. The fkbP ORF encodes the 
NRPS that attaches the pipecolic acid and cyclizes the FK-520 polyketide. 
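
The gene-to-module assignment just described can be captured in a small lookup, sketched here in Python. The gene names are written as fkbA, fkbB, and fkbC; the names `ORF_MODULES`, `orf_for_module`, and `assembly_order` are hypothetical and are introduced only for illustration.

```python
# Modules encoded by each PKS ORF, as stated in the description above.
ORF_MODULES = {
    "fkbB": ["loading", 1, 2, 3, 4],
    "fkbC": [5, 6],
    "fkbA": [7, 8, 9, 10],
}


def orf_for_module(module):
    """Return the ORF that encodes the given module ('loading' or 1-10)."""
    for orf, modules in ORF_MODULES.items():
        if module in modules:
            return orf
    raise KeyError(f"no ORF encodes module {module!r}")


def assembly_order():
    """List the modules in biosynthetic order: loading module first, then 1 to 10."""
    modules = [m for mods in ORF_MODULES.values() for m in mods]
    return sorted(modules, key=lambda m: 0 if m == "loading" else m)


print(orf_for_module(5))
print(assembly_order())
```

Note that the order of the ORFs in the lookup follows the order of the modules in the assembly line, not the alphabetical order of the gene names.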

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain, and 
an ACP domain. The starter building block or unit for FK-520 is believed to be a 
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The recombinant 
DNA compounds of the invention that encode the loading module of the FK-520 PKS and 
the corresponding polypeptides encoded thereby are useful for a variety of methods and in a 
variety of compounds. In one embodiment, a DNA compound comprising a sequence that 
encodes the FK-520 loading module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for the loading module of the heterologous PKS is replaced by the coding 
sequence for the FK-520 loading module, provides a novel PKS coding sequence. Examples 
of heterologous PKS coding sequences include the rapamycin, FK-506, rifamycin, and 
avermectin PKS coding sequences. In another embodiment, a DNA compound comprising a 
sequence that encodes the FK-520 loading module is inserted into a DNA compound that 
comprises the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the loading module coding sequence is utilized 
in conjunction with a heterologous coding sequence. In this embodiment, the invention 
provides, for example, either replacing the CoA ligase with a different CoA ligase, deleting 
the ER, or replacing the ER with a different ER. In addition, or alternatively, the ACP can 
be replaced by another ACP. In similar fashion, the corresponding domains in another 
loading or extender module can be replaced by one or more domains of the FK-520 PKS. 
The resulting heterologous loading module coding sequence can be utilized in conjunction 

SUBSTITUTE SHEET (RULE 26) 



BNSDOCID: <WO 0020601 A2 1A> 



WO 00/20601 PCT/US99/22886 

48 

with a coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or 
another polyketide. 

The first extender module of the FK-520 PKS includes a KS domain, an AT domain 
specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP domain. The 
recombinant DNA compounds of the invention that encode the first extender module of the 
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes the 
FK-520 first extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence for 
a module of the heterologous PKS is either replaced by that for the first extender module of 
the FK-520 PKS or the latter is merely added to coding sequences for modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a DNA 
compound comprising a sequence that encodes the first extender module of the FK-520 
PKS is inserted into a DNA compound that comprises the remainder of the coding sequence 
for the FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, all or only a portion of the first extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 
2-hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the DH 
or KR or both with another DH or KR; and/or inserting an ER. In replacing or inserting KR, 
DH, and ER domains, it is often beneficial to replace the existing KR, DH, and ER domains 
with the complete set of domains desired from another module. Thus, if one desires to insert 
an ER domain, one may simply replace the existing KR and DH domains with a KR, DH, 
and ER set of domains, from a module containing such domains. In addition, the KS and/or 
ACP can be replaced with another KS and/or ACP. In each of these replacements or 
insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can originate 
from a coding sequence for another module of the FK-520 PKS, from a gene for a PKS that 
produces a polyketide other than FK-520, or from chemical synthesis. The resulting 
heterologous first extender module coding sequence can be utilized in conjunction with a 
coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or another 
polyketide. In similar fashion, the corresponding domains in a module of a heterologous 
PKS can be replaced by one or more domains of the first extender module of the FK-520 
PKS. 

In an illustrative embodiment of this aspect of the invention, the invention provides 
recombinant PKSs and recombinant DNA compounds and vectors that encode such PKSs in 
which the KS domain of the first extender module has been inactivated. Such constructs are 
especially useful when placed in translational reading frame with the remaining modules 
and domains of an FK-520 or FK-520 derivative PKS. The utility of these constructs is that 
host cells expressing, or cell free extracts containing, the PKS encoded thereby can be fed or 
supplied with N-acylcysteamine thioesters of novel precursor molecules to prepare FK-520 
derivatives. See U.S. patent application Serial No. 60/117,384, filed 27 Jan. 1999, and PCT 
patent publication Nos. US97/02358 and US99/03986, each of which is incorporated herein 
by reference. 

The second extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA 
compounds of the invention that encode the second extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of applications. 
In one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
second extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence for 
a module of the heterologous PKS is either replaced by that for the second extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a DNA 
compound comprising a sequence that encodes the second extender module of the FK-520 
PKS is inserted into a DNA compound that comprises the coding sequence for the 
remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 
derivative. 

In another embodiment, all or a portion of the second extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 
2-hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In addition, 
the KS and/or ACP can be replaced with another KS and/or ACP. In each of these 
replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous second extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes FK-520, 
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the second 
extender module of the FK-520 PKS. 

The third extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of the 
invention that encode the third extender module of the FK-520 PKS and the corresponding 
polypeptides encoded thereby are useful for a variety of applications. In one embodiment, a 
DNA compound comprising a sequence that encodes the FK-520 third extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous PKS. 
The resulting construct, in which the coding sequence for a module of the heterologous PKS 
is either replaced by that for the third extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for the modules of the heterologous PKS, provides a 
novel PKS coding sequence. In another embodiment, a DNA compound comprising a 
sequence that encodes the third extender module of the FK-520 PKS is inserted into a DNA 
compound that comprises the coding sequence for the remainder of the FK-520 PKS or a 
recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
2-hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In addition, 
the KS and/or ACP can be replaced with another KS and/or ACP. In each of these 
replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous third extender module coding sequence can 
be utilized in conjunction with a coding sequence from a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains in a 
module of a heterologous PKS can be replaced by one or more domains of the third 
extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds 
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of the 
invention that encode the fourth extender module of the FK-520 PKS and the corresponding 
polypeptides encoded thereby are useful for a variety of applications. In one embodiment, a 
DNA compound comprising a sequence that encodes the FK-520 fourth extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous PKS. 
The resulting construct, in which the coding sequence for a module of the heterologous PKS 
is either replaced by that for the fourth extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for the modules of the heterologous PKS, provides a 
novel PKS coding sequence. In another embodiment, a DNA compound comprising a 
sequence that encodes the fourth extender module of the FK-520 PKS is inserted into a 
DNA compound that comprises the remainder of the coding sequence for the FK-520 PKS 
or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, a portion of the fourth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In this 
embodiment, the invention provides, for example, either replacing the ethylmalonyl CoA 
specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA specific 
AT; and/or deleting the inactive DH, inserting a KR, a KR and an active DH, or a KR, an 
active DH, and an ER. In addition, the KS and/or ACP can be replaced with another KS 
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, KR, 
ER, or ACP coding sequence can originate from a coding sequence for another module of 
the FK-520 PKS, a PKS for a polyketide other than FK-520, or from chemical synthesis. 
The resulting heterologous fourth extender module coding sequence can be utilized in 
conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520 
derivative, or another polyketide. In similar fashion, the corresponding domains in a module 
of a heterologous PKS can be replaced by one or more domains of the fourth extender 
module of the FK-520 PKS. 

As illustrative examples, the present invention provides recombinant genes, vectors, 
and host cells that result from the conversion of the FK-506 PKS to an FK-520 PKS and 
vice-versa. In one embodiment, the invention provides a recombinant set of FK-506 PKS 
genes but in which the coding sequences for the fourth extender module or at least those for 
the AT domain in the fourth extender module have been replaced by those for the AT 
domain of the fourth extender module of the FK-520 PKS. This recombinant PKS can be 
used to produce FK-520 in recombinant host cells. In another embodiment, the invention 
provides a recombinant set of FK-520 PKS genes but in which the coding sequences for the 
fourth extender module or at least those for the AT domain in the fourth extender module 
have been replaced by those for the AT domain of the fourth extender module of the FK- 
506 PKS. This recombinant PKS can be used to produce FK-506 in recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which the 
AT domain of module 4 has been replaced with a malonyl specific AT domain to provide a 
PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT domain to 
provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid PKS of the 
invention is prepared by replacing the AT and inactive KR domain of FK-520 extender 
module 4 with a methylmalonyl specific AT and an active KR domain, such as, for 
example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce 21-desethyl- 
21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these hybrid PKS 
enzymes are neurotrophic. 
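
The module 4 AT replacements described above can be summarized in a short Python sketch. The product names are those given in the text; the class `ExtenderModule`, the functions `swap_at` and `predicted_product`, and the lookup table are hypothetical names used only for illustration, and the sketch models bookkeeping rather than any actual enzymology.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExtenderModule:
    number: int
    at_specificity: str    # extender unit selected by the AT domain
    other_domains: tuple   # remaining domains, listed as in the description


# Native FK-520 extender module 4 as described in the text.
MODULE_4 = ExtenderModule(4, "ethylmalonyl CoA", ("KS", "inactive DH", "ACP"))

# Products named in the text for each AT specificity placed in module 4.
PRODUCT_BY_MODULE4_AT = {
    "ethylmalonyl CoA": "FK-520",
    "malonyl CoA": "21-desethyl-FK520",
    "methylmalonyl CoA": "21-desethyl-21-methyl-FK520",
}


def swap_at(module: ExtenderModule, new_specificity: str) -> ExtenderModule:
    """Return a copy of the module whose AT domain selects a different extender unit."""
    return replace(module, at_specificity=new_specificity)


def predicted_product(module: ExtenderModule) -> str:
    """Look up the product named in the text for a module 4 variant."""
    if module.number != 4:
        raise ValueError("this lookup only covers extender module 4")
    return PRODUCT_BY_MODULE4_AT[module.at_specificity]


hybrid = swap_at(MODULE_4, "malonyl CoA")
print(predicted_product(MODULE_4), "->", predicted_product(hybrid))
```

Using an immutable module record means the native module definition is never altered; each hybrid is a new object, which mirrors the way each recombinant construct is a distinct coding sequence.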

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the fifth extender module of the FK-520 PKS and the corresponding 
polypeptides encoded thereby are useful for a variety of applications. In one embodiment, a 
DNA compound comprising a sequence that encodes the FK-520 fifth extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous PKS. 
The resulting construct, in which the coding sequence for a module of the heterologous PKS 
is either replaced by that for the fifth extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for the modules of the heterologous PKS, provides a 
novel PKS. In another embodiment, a DNA compound comprising a sequence that encodes 
the fifth extender module of the FK-520 PKS is inserted into a DNA compound that 
comprises the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the fifth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In this 
embodiment, the invention provides, for example, either replacing the methylmalonyl CoA 
specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA specific 
AT; deleting any one or both of the DH and KR; replacing any one or both of the DH and 
KR with either a KR and/or DH; and/or inserting an ER. In addition, the KS and/or ACP 
can be replaced with another KS and/or ACP. In each of these replacements or insertions, 
the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can originate from a 
coding sequence for another module of the FK-520 PKS, from a coding sequence for a PKS 
that produces a polyketide other than FK-520, or from chemical synthesis. The resulting 
heterologous fifth extender module coding sequence can be utilized in conjunction with a 
coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or another 
polyketide. In similar fashion, the corresponding domains in a module of a heterologous 
PKS can be replaced by one or more domains of the fifth extender module of the FK-520 
PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth extender 
module have been deleted or mutated to render the DH non-functional. In one such mutated 
gene, the KR and DH coding sequences are replaced with those encoding only a KR domain 
from another PKS gene. The resulting PKS genes code for the expression of an FK-520 
PKS that produces an FK-520 analog that lacks the C-19 to C-20 double bond of FK-520 
and has a C-20 hydroxyl group. Such analogs are preferred neurotrophins, because they 
have little or no immunosuppressant activity. This recombinant fifth extender module 
coding sequence can be combined with other coding sequences to make additional 
compounds of the invention. In an illustrative embodiment, the present invention provides a 
recombinant FK-520 PKS that contains both this fifth extender module and the recombinant 
fourth extender module described above that comprises the coding sequence for the fourth 
extender module AT domain of the FK-506 PKS. The invention also provides recombinant 
host cells derived from FK-506 producing host cells that have been mutated to prevent 
production of FK-506 but that express this recombinant PKS and so synthesize the 
corresponding (lacking the C-19 to C-20 double bond of FK-506 and having a C-20 
hydroxyl group) FK-506 derivative. In another embodiment, the present invention provides 
a recombinant FK-506 PKS in which the DH domain of module 5 has been deleted or 
otherwise rendered inactive and thus produces this novel polyketide. 
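
Why removing the DH from the fifth extender module yields a hydroxyl group in place of a double bond can be seen from a simplified model of the reductive domains, sketched below in Python. The model assumes the usual sequential dependence (the DH acts on the product of the KR, and the ER on the product of the DH); the function name `beta_carbon_state` is hypothetical, and the sketch ignores stereochemistry and any downstream tailoring.

```python
def beta_carbon_state(active_domains) -> str:
    """Predict the state of the beta-carbon left by a module's active reductive domains.

    Simplified model: KR reduces the beta-keto group to a hydroxyl, DH dehydrates
    that hydroxyl to a double bond, and ER reduces the double bond to a methylene.
    """
    domains = set(active_domains)
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "double bond"
    return "methylene"


# Fifth extender module as described: active KR and DH domains.
native_module_5 = {"KR", "DH"}
# Recombinant module 5 in which the DH is deleted or inactivated.
mutant_module_5 = native_module_5 - {"DH"}

print(beta_carbon_state(native_module_5))
print(beta_carbon_state(mutant_module_5))
```

Under this model the native module leaves a double bond and the DH-deficient module leaves a hydroxyl group, consistent with the analog described above.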

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA compounds 
of the invention that encode the sixth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In one 
embodiment, a DNA compound comprising a sequence that encodes the FK-520 sixth 
extender module is inserted into a DNA compound that comprises the coding sequence for a 
heterologous PKS. The resulting construct, in which the coding sequence for a module of 
the heterologous PKS is either replaced by that for the sixth extender module of the FK-520 
PKS or the latter is merely added to coding sequences for the modules of the heterologous 
PKS, provides a novel PKS coding sequence. In another embodiment, a DNA compound 
comprising a sequence that encodes the sixth extender module of the FK-520 PKS is 
inserted into a DNA compound that comprises the coding sequence for the remainder of the 
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In this 
embodiment, the invention provides, for example, either replacing the methylmalonyl CoA 
specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA specific 
AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing any one, 
two, or all three of the KR, DH, and ER with another KR, DH, and ER. In addition, the KS 
and/or ACP can be replaced with another KS and/or ACP. In each of these replacements, 
the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can originate from a 
coding sequence for another module of the FK-520 PKS, from a coding sequence for a PKS 
that produces a polyketide other than FK-520, or from chemical synthesis. The resulting 
heterologous sixth extender module coding sequence can be utilized in conjunction with a 
coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or another 
polyketide. In similar fashion, the corresponding domains in a module of a heterologous 
PKS can be replaced by one or more domains of the sixth extender module of the FK-520 
PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the sixth 
extender module have been deleted or mutated to render them non-functional. In one such 
mutated gene, the KR, ER, and DH coding sequences are replaced with those encoding only 
a KR domain from another PKS gene. This can also be accomplished by simply replacing 
the coding sequences for extender module six with those for an extender module having a 
methylmalonyl specific AT and only a KR domain from a heterologous PKS gene, such as, 

for example, the coding sequences for extender module two encoded by the eryAI gene. The 
resulting PKS genes code for the expression of an FK-520 PKS that produces an FK-520 
analog that has a C-18 hydroxyl group. Such analogs are preferred neurotrophins, because 
they have little or no immunosuppressant activity. This recombinant sixth extender module 
coding sequence can be combined with other coding sequences to make additional 
compounds of the invention. In an illustrative embodiment, the present invention provides a 
recombinant FK-520 PKS that contains both this sixth extender module and the 
recombinant fourth extender module described above that comprises the coding sequence 
for the fourth extender module AT domain of the FK-506 PKS. The invention also provides 
recombinant host cells derived from FK-506 producing host cells that have been mutated to 
prevent production of FK-506 but that express this recombinant PKS and so synthesize the 
corresponding (having a C-18 hydroxyl group) FK-506 derivative. In another embodiment, 
the present invention provides a recombinant FK-506 PKS in which the DH and ER 
domains of module 6 have been deleted or otherwise rendered inactive and thus produces 
this novel polyketide. 

The seventh extender module of the FK-520 PKS includes a KS, an AT specific for 
2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of applications. 
In one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
seventh extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence for 
a module of the heterologous PKS is either replaced by that for the seventh extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of the 

25 heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a DNA 
compound comprising a sequence that encodes the seventh extender module of the FK-520 
PKS is inserted into a DNA compound that comprises the coding sequence for the 
remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 
derivative, 

In another embodiment, a portion or all of the seventh extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, KR,
ER, or ACP coding sequence can originate from a coding sequence for another module of
the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than
FK-520, or from chemical synthesis. The resulting heterologous seventh extender module
coding sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the
corresponding domains in a module of a heterologous PKS can be replaced by one or more
domains of the seventh extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position,
respectively. Such analogs are preferred, because they are more slowly metabolized than
FK-520. This recombinant seventh extender module coding sequence can be combined with
other coding sequences to make additional compounds of the invention. In an illustrative
embodiment, the present invention provides a recombinant FK-520 PKS that contains both
this seventh extender module and the recombinant fourth extender module described above
that comprises the coding sequence for the fourth extender module AT domain of the
FK-506 PKS. The invention also provides recombinant host cells derived from FK-506
producing host cells that have been mutated to prevent production of FK-506 but that
express this recombinant PKS and so synthesize the corresponding (C-15-desmethoxy)
FK-506 derivative. In another embodiment, the present invention provides a recombinant
FK-506 PKS in which the AT domain of module 7 has been replaced and thus produces this
novel polyketide. 
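
The correspondence between the extender unit selected by a replacement AT domain and the
substituent found at the affected carbon can be restated as a small lookup. The Python
sketch below is illustrative only: the dictionary and function names are hypothetical, and
the mapping simply repeats what the text states (malonyl, methylmalonyl, and ethylmalonyl
CoA give hydrogen, methyl, and ethyl, respectively, in place of the methoxy group that
derives from 2-hydroxymalonyl CoA). It covers module 7 (C-15) and module 8 (C-13), the two
modules for which the text describes this replacement.

```python
# Illustrative sketch; all names are hypothetical and not part of the invention.
# Substituent at the affected carbon for each AT extender-unit specificity,
# as stated in the text for modules 7 and 8 of the FK-520 PKS.
SUBSTITUENT_BY_EXTENDER = {
    "2-hydroxymalonyl": "methoxy",  # native specificity of modules 7 and 8
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}

# Carbon of the FK-520 core affected by the AT domain of each module.
POSITION_BY_MODULE = {7: "C-15", 8: "C-13"}


def describe_at_swap(module: int, new_extender: str) -> str:
    """Return a plain-text description of the analog expected from an AT swap."""
    position = POSITION_BY_MODULE[module]
    substituent = SUBSTITUENT_BY_EXTENDER[new_extender]
    if substituent == "methoxy":
        return f"{position} methoxy retained (no change)"
    return f"{position} methoxy replaced by {substituent}"


if __name__ == "__main__":
    for extender in ("malonyl", "methylmalonyl", "ethylmalonyl"):
        print(describe_at_swap(7, extender))
```

A lookup table is used rather than any chemical modeling because the text describes the
outcome of each swap directly; anything beyond these four extender units would be an
assumption.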

In another illustrative embodiment, the present invention provides a hybrid PKS in 
which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, the
AT and KR domains of extender module 6 of the rapamycin PKS. The resulting hybrid PKS 
produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin compound.

The eighth extender module of the FK-520 PKS includes a KS, an AT specific for
2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of the
invention that encode the eighth extender module of the FK-520 PKS and the corresponding
polypeptides encoded thereby are useful for a variety of applications. In one embodiment, a
DNA compound comprising a sequence that encodes the FK-520 eighth extender module is
inserted into a DNA compound that comprises the coding sequence for a heterologous PKS.
The resulting construct, in which the coding sequence for a module of the heterologous PKS
is either replaced by that for the eighth extender module of the FK-520 PKS or the latter is
merely added to coding sequences for the modules of the heterologous PKS, provides a
novel PKS coding sequence. In another embodiment, a DNA compound comprising a
sequence that encodes the eighth extender module of the FK-520 PKS is inserted into a
DNA compound that comprises the coding sequence for the remainder of the FK-520 PKS
or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the eighth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In this
embodiment, the invention provides, for example, either replacing the 2-hydroxymalonyl
CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or malonyl CoA specific
AT; deleting or replacing the KR; and/or inserting a DH or a DH and an ER. In addition, the
KS and/or ACP can be replaced with another KS and/or ACP. In each of these
replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can
originate from a coding sequence for another module of the FK-520 PKS, from a coding
sequence for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous eighth extender module coding sequence can be
utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, or
another polyketide. In similar fashion, the corresponding domains in a module of a
heterologous PKS can be replaced by one or more domains of the eighth extender module of
the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth extender 
module have been replaced with those encoding an AT domain for malonyl, methylmalonyl,
or ethylmalonyl CoA from another PKS gene. The resulting PKS genes code for the
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-13 methoxy
group, having instead a hydrogen, methyl, or ethyl group at that position, respectively. Such
analogs are preferred, because they are more slowly metabolized than FK-520. This
recombinant eighth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this eighth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (C-13-desmethoxy) FK-506 
derivative. In another embodiment, the present invention provides a recombinant FK-506 
PKS in which the AT domain of module 8 has been replaced and thus produces this novel 
polyketide. 

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA compounds
of the invention that encode the ninth extender module of the FK-520 PKS and the
corresponding polypeptides encoded thereby are useful for a variety of applications. In one
embodiment, a DNA compound comprising a sequence that encodes the FK-520 ninth
extender module is inserted into a DNA compound that comprises the coding sequence for a
heterologous PKS. The resulting construct, in which the coding sequence for a module of
the heterologous PKS is either replaced by that for the ninth extender module of the FK-520
PKS or the latter is merely added to coding sequences for the modules of the heterologous
PKS, provides a novel PKS coding sequence. In another embodiment, a DNA compound
comprising a sequence that encodes the ninth extender module of the FK-520 PKS is
inserted into a DNA compound that comprises the coding sequence for the remainder of the
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the ninth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In this 
embodiment, the invention provides, for example, either replacing the methylmalonyl CoA 
specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA specific
AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing any one,
two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In addition, the
KS and/or ACP can be replaced with another KS and/or ACP. In each of these
replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can
originate from a coding sequence for another module of the FK-520 PKS, from a coding
sequence for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous ninth extender module coding sequence can be
utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, or
another polyketide. In similar fashion, the corresponding domains in a module of a
heterologous PKS can be replaced by one or more domains of the ninth extender module of
the FK-520 PKS. 

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that encode
the tenth extender module of the FK-520 PKS and the corresponding polypeptides encoded
thereby are useful for a variety of applications. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 tenth extender module is inserted into a
DNA compound that comprises the coding sequence for a heterologous PKS. The resulting
construct, in which the coding sequence for a module of the heterologous PKS is either
replaced by that for the tenth extender module of the FK-520 PKS or the latter is merely
added to coding sequences for the modules of the heterologous PKS, provides a novel PKS
coding sequence. In another embodiment, a DNA compound comprising a sequence that
encodes the tenth extender module of the FK-520 PKS is inserted into a DNA compound
that comprises the coding sequence for the remainder of the FK-520 PKS or a recombinant
FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion or all of the tenth extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH, and
an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In
each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP
coding sequence can originate from a coding sequence for another module of the FK-520
PKS, from a coding sequence for a PKS that produces a polyketide other than FK-520, or
from chemical synthesis. The resulting heterologous tenth extender module coding sequence
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains in a
module of a heterologous PKS can be replaced by one or more domains of the tenth
extender module of the FK-520 PKS. 
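
The domain compositions given above for extender modules seven through ten, and the
reductive-domain insertions the text lists for modules eight and ten, can be captured in a
short data model. The following Python is a sketch under stated assumptions: the class,
constant, and function names are hypothetical, and the rule that domains are inserted in
KR, DH, ER order is inferred from the options listed in the text (a DH, or a DH and an ER,
for module eight; a KR, a KR and DH, or a KR, DH, and ER for module ten).

```python
from dataclasses import dataclass

# Reductive domains in the order in which they act on the beta-keto group.
REDUCTIVE_ORDER = ("KR", "DH", "ER")


@dataclass(frozen=True)
class Module:
    """Hypothetical record for one extender module (KS, AT, and ACP are implied)."""
    number: int
    at_specificity: str  # extender unit selected by the AT domain
    reductive: tuple     # reductive domains present, a subset of REDUCTIVE_ORDER


# Domain composition of extender modules 7-10 as described in the text.
FK520_MODULES = {
    7: Module(7, "2-hydroxymalonyl", ("KR", "DH", "ER")),
    8: Module(8, "2-hydroxymalonyl", ("KR",)),
    9: Module(9, "methylmalonyl", ("KR", "DH", "ER")),
    10: Module(10, "malonyl", ()),
}


def insertion_options(module: Module) -> list:
    """List the sets of reductive domains that could be added to a module.

    Each option extends the previous one by the next missing domain in
    KR -> DH -> ER order.
    """
    missing = [d for d in REDUCTIVE_ORDER if d not in module.reductive]
    return [tuple(missing[: i + 1]) for i in range(len(missing))]


if __name__ == "__main__":
    for number, module in FK520_MODULES.items():
        print(number, insertion_options(module))
```

Modules seven and nine already carry all three reductive domains, so the sketch returns no
insertion options for them, consistent with the text describing only deletions or
replacements for those modules.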

The FK-520 polyketide precursor produced by the action of the tenth extender 
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The 
enzyme FkbP is the NRPS-like enzyme that catalyzes these reactions. FkbP also includes a
thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The
present invention provides recombinant DNA compounds that encode the fkbP gene and so
provides recombinant methods for expressing the fkbP gene product in recombinant host
cells. The recombinant fkbP genes of the invention include those in which the coding
sequence for the adenylation domain has been mutated or replaced with coding sequences
from other NRPS-like enzymes so that the resulting recombinant FkbP incorporates a
moiety other than pipecolic acid. For the construction of host cells that do not naturally
produce pipecolic acid, the present invention provides recombinant DNA compounds that
express the enzymes that catalyze at least some of the biosynthesis of pipecolic acid (see
Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a homolog of RapL, a
lysine cyclodeaminase responsible in part for producing the pipecolate unit added to the end
of the polyketide chain. The fkbB and fkbL recombinant genes of the invention can be used
in heterologous hosts to produce compounds such as FK-520 or, in conjunction with other
PKS or NRPS genes, to produce known or novel polyketides and non-ribosomal peptides.

The present invention also provides recombinant DNA compounds that encode the
P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520. Figure 2
shows the various sites on the FK-520 polyketide core structure at which these enzymes act.
By providing these genes in recombinant form, the present invention provides recombinant
host cells that can produce FK-520. This is accomplished by introducing the recombinant
PKS, P450 oxidase, and methyltransferase genes into a heterologous host cell. In a preferred
embodiment, the heterologous host cell is Streptomyces coelicolor CH999 or Streptomyces
lividans K4-114, as described in U.S. Patent No. 5,830,750 and U.S. patent application
Serial Nos. 08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of
which is incorporated herein by reference. In addition, by providing recombinant host cells
that express only a subset of these genes, the present invention provides methods for making
FK-520 precursor compounds not readily obtainable by other means. 

In a related aspect, the present invention provides recombinant DNA compounds 
and vectors that are useful in generating, by homologous recombination, recombinant host
cells that produce FK-520 precursor compounds. In this aspect of the invention, a native
host cell that produces FK-520 is transformed with a vector (such as an SCP2* derived
vector for Streptomyces host cells) that encodes one or more disrupted genes (i.e., a
hydroxylase, a methyltransferase, or both) or merely flanking regions from those genes.
When the vector integrates by homologous recombination, the native, functional gene is
deleted or replaced by the non-functional recombinant gene, and the resulting host cell thus
produces an FK-520 precursor. Such host cells can also be complemented by introduction of
a modified form of the deleted or mutated non-functional gene to produce a novel
compound. 

In one important embodiment, the present invention provides a hybrid PKS and the
corresponding recombinant DNA compounds that encode those hybrid PKS enzymes. For
purposes of the present invention a hybrid PKS is a recombinant PKS that comprises all or
part of one or more modules and thioesterase/cyclase domain of a first PKS and all or part
of one or more modules, loading module, and thioesterase/cyclase domain of a second PKS.
In one preferred embodiment, the first PKS is all or part of the FK-520 PKS, and the second
PKS is only a portion or all of a non-FK-520 PKS.

One example of the preferred embodiment is an FK-520 PKS in which the AT 
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13 
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a malonyl,
methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT domains include
the AT domains from modules 3, 12, and 13 of the rapamycin PKS and from modules 1
and 2 of the erythromycin PKS. Such replacements, conducted at the level of the gene for
the PKS, are illustrated in the examples below. Another illustrative example of such a
hybrid PKS includes an FK-520 PKS in which the natural loading module has been replaced
with a loading module of another PKS. Another example of such a hybrid PKS is an
FK-520 PKS in which the AT domain of module three is replaced with an AT domain that binds
methylmalonyl CoA. 

In another preferred embodiment, the first PKS is most but not all of a non-FK-520 
PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific for
methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for malonyl
CoA.

Those of skill in the art will recognize that all or part of either the first or second
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring
source. For example, only a small portion of an AT domain determines its specificity. See
U.S. provisional patent application Serial No. 60/091,526, incorporated herein by reference.
The state of the art in DNA synthesis allows the artisan to construct de novo DNA
compounds of size sufficient to construct a useful portion of a PKS module or domain. For
purposes of the present invention, such synthetic DNA compounds are deemed to be a 
portion of a PKS. 

Thus, the hybrid modules of the invention are incorporated into a PKS to provide a 
hybrid PKS of the invention. A hybrid PKS of the invention can result not only:

(i) from fusions of heterologous domain (where heterologous means the domains in
that module are from at least two different naturally occurring modules) coding sequences
to produce a hybrid module coding sequence contained in a PKS gene whose product is
incorporated into a PKS,

but also:

(ii) from fusions of heterologous module (where heterologous module means two
modules are adjacent to one another that are not adjacent to one another in naturally
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence contained
in a PKS gene whose product is incorporated into a PKS,

(iii) from expression of one or more FK-520 PKS genes with one or more non-FK-520
PKS genes, including both naturally occurring and recombinant non-FK-520 PKS
genes, and

(iv) from combinations of the foregoing.

Various hybrid PKSs of the invention illustrating these various alternatives are described
herein.
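
The two definitions of "heterologous" given in alternatives (i) and (ii) can be expressed
as simple checks. The Python below is a minimal sketch, not a method of the invention: the
function names and the way modules are identified (as pairs of a PKS name and a module
number) are assumptions made for illustration.

```python
# Illustrative sketch; function and variable names are hypothetical.

def is_hybrid_module(domain_sources):
    """Alternative (i): the domains of one module come from at least two
    different naturally occurring modules.

    domain_sources maps a domain name such as "AT" to a (PKS name, module
    number) pair identifying the natural module that supplied it.
    """
    return len(set(domain_sources.values())) >= 2


def has_heterologous_junction(module_order, natural_neighbors):
    """Alternative (ii): some pair of adjacent modules is not adjacent in any
    naturally occurring PKS.

    module_order is a list of (PKS name, module number) pairs; natural_neighbors
    is a set of ordered pairs of such modules that are adjacent in nature.
    """
    adjacent_pairs = zip(module_order, module_order[1:])
    return any(pair not in natural_neighbors for pair in adjacent_pairs)


# Module 8 of the FK-520 PKS with its AT domain taken from rapamycin module 12.
module_8_hybrid = {
    "KS": ("FK-520", 8),
    "AT": ("rapamycin", 12),
    "KR": ("FK-520", 8),
    "ACP": ("FK-520", 8),
}
print(is_hybrid_module(module_8_hybrid))  # True
```

Alternatives (iii) and (iv) concern co-expression of whole genes and combinations of the
other cases, so they are not modeled here.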

Examples of the production of a hybrid PKS by co-expression of PKS genes from 
the FK-520 PKS and another non-FK-520 PKS include hybrid PKS enzymes produced by 
coexpression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS enzymes 
are produced in recombinant Streptomyces host cells that produce FK-520 or FK-506 but 
have been mutated to inactivate the gene whose function is to be replaced by the rapamycin
PKS gene introduced to produce the hybrid PKS. Particular examples include (i)
replacement of the fkbC gene with the rapB gene; and (ii) replacement of the fkbA gene with
the rapC gene. The latter hybrid PKS produces 13,15-didesmethoxy-FK-520, if the host cell
is an FK-520 producing host cell, and 13,15-didesmethoxy-FK-506, if the host cell is an
FK-506 producing host cell. The compounds produced by these hybrid PKS enzymes are 
immunosuppressants and neurotrophins but can be readily modified to act only as 
neurotrophins, as described in Example 6, below.

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing the
fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in which:

(a) the extender module 8 through 10, inclusive, coding sequences have been replaced by
the coding sequences for extender modules 12 to 14, inclusive, of the rapamycin PKS; and

(b) the module 8 coding sequences have been replaced by the module 8 coding sequence of
the rifamycin PKS. When expressed with the other, naturally occurring FK-520 or FK-506
PKS genes and the genes of the modification enzymes, the resulting hybrid PKS enzymes
produce, respectively, (a) 13-desmethoxy-FK-520 or 13-desmethoxy-FK-506; and (b)
13-desmethoxy-13-methyl-FK-520 or 13-desmethoxy-13-methyl-FK-506. In a preferred
embodiment, these recombinant PKS genes of the invention are introduced into the
producing host cell by a vector such as pHU204, which is a plasmid pRM5 derivative that
has the well-characterized SCP2* replicon, the colE1 replicon, the tsr and bla resistance
genes, and a cos site. This vector can be used to introduce the recombinant fkbA
replacement gene into an FK-520 or FK-506 producing host cell (or a host cell derived
therefrom in which the endogenous fkbA gene has been rendered inactive by
mutation, deletion, or homologous recombination with the gene that replaces it) to produce
the desired hybrid PKS. 
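
The compound names used in this description follow a visible pattern: a methoxy group
replaced by hydrogen is named "desmethoxy" (or "didesmethoxy" for two positions), and one
replaced by an alkyl group is named "desmethoxy" followed by the position and the alkyl
group. The Python sketch below reproduces only that pattern for the cases that appear in
the text. It is an illustration with a hypothetical function name, not a general
nomenclature tool, and it deliberately refuses combinations the text does not name.

```python
# Illustrative sketch; covers only the naming patterns that appear in the text.
ALLOWED_GROUPS = {"hydrogen", "methyl", "ethyl"}


def name_analog(parent: str, substituents: dict) -> str:
    """Build a trivial name for an analog in which methoxy groups are replaced.

    substituents maps a carbon number (e.g. 13) to the replacing group:
    "hydrogen", "methyl", or "ethyl".
    """
    if not substituents:
        return parent
    if not set(substituents.values()) <= ALLOWED_GROUPS:
        raise ValueError("unsupported replacing group")
    positions = sorted(substituents)
    groups = {substituents[p] for p in positions}
    if groups == {"hydrogen"}:
        prefix = {1: "", 2: "di"}.get(len(positions))
        if prefix is None:
            raise ValueError("only one or two positions are covered by this sketch")
        locants = ",".join(str(p) for p in positions)
        return f"{locants}-{prefix}desmethoxy-{parent}"
    if len(positions) == 1:
        p = positions[0]
        return f"{p}-desmethoxy-{p}-{substituents[p]}-{parent}"
    raise ValueError("mixed or multiple alkyl replacements are not covered by this sketch")


if __name__ == "__main__":
    print(name_analog("FK-520", {13: "hydrogen"}))
    print(name_analog("FK-520", {13: "methyl"}))
    print(name_analog("FK-506", {13: "hydrogen", 15: "hydrogen"}))
```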

In constructing hybrid PKSs of the invention, certain general methods may be 
helpful. For example, it is often beneficial to retain the framework of the module to be 
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to a
module, it is often preferred to replace the KR domain of the original module with a KR,
DH, and ER domain-containing segment from another module, instead of merely inserting
DH and ER domains. One can alter the stereochemical specificity of a module by
replacement of the KS domain with a KS domain from a module that specifies a different
stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase domains of
modular polyketide synthases in the choice and stereochemical fate of extender units,"
Biochemistry 38(5): 1643-1651, incorporated herein by reference. Stereochemistry can also
be changed by changing the KR domain. Also, one can alter the specificity of an AT
domain by changing only a small segment of the domain. See Lau et al., supra. One can
also take advantage of known linker regions in PKS proteins to link modules from two
different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr. 1999, "Dissecting and
Exploiting Intermodular Communication in Polyketide Synthases," Science 284: 482-485,
incorporated herein by reference.

The following Table lists references describing illustrative PKS genes and
corresponding enzymes that can be utilized in the construction of the recombinant PKSs of
the invention and the corresponding DNA compounds that encode them. Also presented are
various references describing tailoring enzymes and corresponding genes that can be
employed in accordance with the methods of the present invention.

Avermectin
    U.S. Pat. No. 5,252,474 to Merck.
    MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular
    Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the
    Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and Nemadectin.
    MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the
    Streptomyces avermitilis genes encoding the avermectin polyketide synthase.
    Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the
    polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl. Acad.
    Sci. USA 96: 9509-9514.
Candicidin (FR008)
    Hu et al., 1994, Mol. Microbiol. 14: 163-172.
Epothilone
    U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999.
Erythromycin
    PCT Pub. No. 93/13663 to Abbott.
    US Pat. No. 5,824,513 to Abbott.
    Donadio et al., 1991, Science 252: 675-9.
    Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large
    multifunctional polypeptide in the erythromycin producing polyketide synthase of
    Saccharopolyspora erythraea.
Glycosylation Enzymes
    PCT Pat. App. Pub. No. 97/23630 to Abbott.
FK-506
    Motamedi et al., 1998, The biosynthetic gene cluster for the macrolactone ring of
    the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534.
    Motamedi et al., 1997, Structural organization of a multifunctional polyketide
    synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur. J.
    Biochem. 244: 74-80.
Methyltransferase
    US 5,264,355, issued 23 Nov. 1993, Methylating enzyme from
    Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase.
    Motamedi et al., 1996, Characterization of methyltransferase and
    hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and
    FK-520, J. Bacteriol. 178: 5243-5248.
Streptomyces hygroscopicus
    U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998.
Lovastatin
    U.S. Pat. No. 5,744,350 to Merck.
Narbomycin
    U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No.
    60/120,254, filed 16 Feb. 1999.
Nemadectin
    MacNeil et al., 1993, supra.
Niddamycin
    Kakavas et al., 1997, Identification and characterization of the niddamycin
    polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522.
Oleandomycin
    Swan et al., 1994, Characterisation of a Streptomyces antibioticus gene encoding a
    type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet. 242:
    358-362.
    U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999.
    Olano et al., 1998, Analysis of a Streptomyces antibioticus chromosomal region
    involved in oleandomycin biosynthesis, which encodes two glycosyltransferases responsible
    for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(3): 299-308.
Picromycin
    PCT patent application US99/15047, filed 2 Jul. 1999.
    Xue et al., 1998, Hydroxylation of macrolactones YC-17 and narbomycin is
    mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry &
    Biology 5(11): 661-667.
    Xue et al., Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in
    Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci. USA
    95: 12111-12116.
Platenolide
    EP Pat. App. Pub. No. 791,656 to Lilly.
Rapamycin
    Schwecke et al., Aug. 1995, The biosynthetic gene cluster for the
    polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92: 7839-7843.
    Aparicio et al., 1996, Organization of the biosynthetic gene cluster for rapamycin in
    Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular polyketide
    synthase, Gene 169: 9-16.
Rifamycin
    August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin:
    deductions from the molecular analysis of the rif biosynthetic gene cluster of Amycolatopsis
    mediterranei S699, Chemistry & Biology 5(2): 69-79.
Sorangium PKS
    U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998.
Soraphen
    U.S. Pat. No. 5,716,849 to Novartis.
    Schupp et al., 1995, J. Bacteriology 177: 3673-3679. A Sorangium cellulosum
    (Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic Soraphen
    A: Cloning, Characterization, and Homology to Polyketide Synthase Genes from
    Actinomycetes.
Spiramycin
    U.S. Pat. No. 5,098,837 to Lilly.
Activator Gene
    U.S. Pat. No. 5,514,544 to Lilly.
Tylosin
    EP Pub. No. 791,655 to Lilly.
    U.S. Pat. No. 5,876,991 to Lilly.
    Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide through
    the construction of a hybrid polyketide synthase.
Tailoring enzymes
    Merson-Davies and Cundliffe, 1994, Mol. Microbiol. 13: 349-355. Analysis of five
    tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae genome.

As the above Table illustrates, there are a wide variety of polyketide synthase genes 
that serve as readily available sources of DNA and sequence information for use in 
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for 
constructing hybrid PKS-encoding DNA compounds are described without reference to the
FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491 and
5,712,146 and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and 
09/141,908, filed 28 Aug 1998, each of which is incorporated herein by reference. 

The hybrid PKS-encoding DNA compounds of the invention can be and often are 
hybrids of more than two PKS genes. Moreover, there are often two or more modules in the
hybrid PKS in which all or part of the module is derived from a second (or third) PKS.
Thus, as one illustrative example, the present invention provides a hybrid FK-520 PKS that
contains the naturally occurring loading module and FkbP as well as modules one, two,
four, six, seven, eight, nine, and ten of the FK-520 PKS and further contains hybrid or
heterologous modules three and five. Hybrid or heterologous module three contains an AT
domain that is specific for methylmalonyl CoA and can be derived, for example, from the
erythromycin or rapamycin PKS genes. Hybrid or heterologous module five contains an AT
domain that is specific for malonyl CoA and can be derived, for example, from the

While an important embodiment of the present invention relates to hybrid PKS 
enzymes and corresponding genes, the present invention also provides recombinant FK-520
PKS genes in which there is no second PKS gene sequence present but which differ from
the FK-520 PKS gene by one or more deletions. The deletions can encompass one or more
modules and/or can be limited to a partial deletion within one or more modules. When a
deletion encompasses an entire module, the resulting FK-520 derivative is at least two
carbons shorter than the compound from which it was derived. When a deletion is within a
module, the deletion typically encompasses a KR, DH, or ER domain, or both DH and ER
domains, or both KR and DH domains, or all three KR, DH, and ER domains.

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one can
employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent application Serial
No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated herein by reference, in
which the large PKS gene is divided into two or more, typically three, segments, and each
segment is placed on a separate expression vector. In this manner, each of the segments of
the gene can be altered, and various altered segments can be combined in a single host cell 
to provide a recombinant PKS gene of the invention. This technique makes more efficient 
the construction of large libraries of recombinant PKS genes, vectors for expressing those 
genes, and host cells comprising those vectors. 
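
The efficiency gain from splitting the gene into separately altered segments is
combinatorial: every variant of one segment can be paired with every variant of the
others. The Python sketch below illustrates this with three segments. The segment names,
the variant labels, and the assignment of alterations to particular segments are all
hypothetical and chosen only to show the counting.

```python
from itertools import product

# Hypothetical variants of each of three gene segments, each carried on its
# own expression vector. Labels are for illustration only.
segment_variants = {
    "segment_1": ["native", "module3_AT_swap"],
    "segment_2": ["native", "module6_DH_ER_deleted"],
    "segment_3": ["native", "module8_AT_malonyl", "module8_AT_methylmalonyl"],
}


def enumerate_library(variants: dict) -> list:
    """Return every combination of one variant per segment."""
    names = list(variants)
    return [
        dict(zip(names, combo))
        for combo in product(*(variants[name] for name in names))
    ]


library = enumerate_library(segment_variants)
print(len(library))  # 2 * 2 * 3 = 12 combinations
```

Seven altered or native segment constructs thus yield twelve distinct recombinant PKS genes
in this example, without building each full-length gene separately.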

Thus, in one important embodiment, the recombinant DNA compounds of the
invention are expression vectors. As used herein, the term expression vector refers to any
nucleic acid that can be introduced into a host cell or cell-free transcription and translation
medium. An expression vector can be maintained stably or transiently in a cell, whether as
part of the chromosomal or other DNA in the cell or in any cellular compartment, such as a
replicating vector in the cytoplasm. An expression vector also comprises a gene that serves
to produce RNA that is translated into a polypeptide in the cell or cell extract. Furthermore,
expression vectors typically contain additional functional elements, such as resistance- 
conferring genes to act as selectable markers. 

The various components of an expression vector can vary widely, depending on the
intended use of the vector. In particular, the components depend on the host cell(s) in which
the vector will be used or is intended to function. Vector components for expression and
maintenance of vectors in E. coli are widely known and commercially available, as are
vector components for other commonly used organisms, such as yeast cells and
Streptomyces cells.

In a preferred embodiment, the expression vectors of the invention are used to
construct recombinant Streptomyces host cells that express a recombinant PKS of the
invention. Preferred Streptomyces host cell/vector combinations of the invention include S.
coelicolor CH999 and S. lividans K4-114 host cells, which do not produce actinorhodin,
and expression vectors derived from the pRM1 and pRM5 vectors, as described in U.S.
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar.
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by
reference.

The present invention provides a wide variety of expression vectors for use in
Streptomyces. For replicating vectors, the origin of replication can be, for example and
without limitation, a low copy number vector, such as SCP2* (see Hopwood et al., Genetic
Manipulation of Streptomyces: A Laboratory Manual (The John Innes Foundation,
Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser and Melton,
1988, Gene 65: 83-91, each of which is incorporated herein by reference), SLP1.2
(Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and SG5(ts)
(Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992, Gene 116:
43-49, each of which is incorporated herein by reference), or a high copy number vector, such
as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129: 2703-2714; Vara et al.,
1989, J. Bacteriol. 171: 5782-5781; and Servin-Gonzalez, 1993, Plasmid 30: 131-140, each
of which is incorporated herein by reference). Generally, however, high copy number
vectors are not preferred for expression of genes contained on large segments of DNA. For
non-replicating and integrating vectors, it is useful to include at least an E. coli origin of
replication, such as from pUC, p1P, p1I, and pBR. For phage based vectors, the phages
phiC31 and KC515 can be employed (see Hopwood et al., supra).

Typically, the expression vector will comprise one or more marker genes by which
host cells containing the vector can be identified and/or selected. Useful antibiotic
resistance-conferring genes for use in Streptomyces host cells include the ermE (confers
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance to
thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4 (confers
resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and neomycin), hyg
(confers resistance to hygromycin), and vph (confers resistance to viomycin) genes.

The recombinant PKS gene on the vector will be under the control of a promoter,
typically with an attendant ribosome binding site sequence. The present invention provides
the endogenous promoters of the FK-520 PKS and related biosynthetic genes in
recombinant form, and these promoters are preferred for use in the native hosts and in
heterologous hosts in which the promoters function. A preferred promoter of the invention
is the fkbO gene promoter, comprised in a sequence of about 270 bp between the start of the
open reading frames of the fkbO and fkbB genes. The fkbO promoter is believed to be
bi-directional in that it promotes transcription of the genes fkbO, fkbP, and fkbA in one
direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the present invention
provides a recombinant expression vector comprising the promoter of the fkbO gene of an
FK-520 producing organism positioned to transcribe a gene other than fkbO. In a preferred
embodiment, the transcribed gene is an FK-520 PKS gene. In another preferred embodiment,
the transcribed gene is a gene that encodes a protein comprised in a hybrid PKS.

Heterologous promoters can also be employed and are preferred for use in host cells
in which the endogenous FK-520 PKS gene promoters do not function or function poorly. A
preferred heterologous promoter is the actI promoter and its attendant activator gene
actII-ORF4, which is provided in the pRM1 and pRM5 expression vectors, supra. This promoter
is activated in the stationary phase of growth when secondary metabolites are normally
synthesized. Other useful Streptomyces promoters include without limitation those from the
ermE gene and the melC1 gene, which act constitutively, and the tipA gene and the merA
gene, which can be induced at any growth stage. In addition, the T7 RNA polymerase
system has been transferred to Streptomyces and can be employed in the vectors and host
cells of the invention. In this system, the coding sequence for the T7 RNA polymerase is
inserted into a neutral site of the chromosome or in a vector under the control of the
inducible merA promoter, and the gene of interest is placed under the control of the T7
promoter. As noted above, one or more activator genes can also be employed to enhance the
activity of a promoter. Activator genes in addition to the actII-ORF4 gene discussed above
include dnrI, redD, and ptpA genes (see U.S. patent application Serial No. 09/181,833,
supra) to activate promoters under their control.

In addition to providing recombinant DNA compounds that encode the FK-520
PKS, the present invention also provides DNA compounds that encode the enzymes that
produce the ethylmalonyl CoA and 2-hydroxymalonyl CoA utilized in the synthesis of FK-520.
Thus, the present invention also provides recombinant host cells that express the genes
required for the biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA. Figures 3 and 4
show the location of these genes on the cosmids of the invention and the biosynthetic
pathway that produces ethylmalonyl CoA.

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are
sufficient to confer this ability on Streptomyces host cells. For conversion of
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the complete
coding sequence for fkbH is provided on the cosmids of the invention, the sequence for this
gene provided herein may be missing a T residue, based on a comparison made with a
similar gene cloned from the ansamitocin gene cluster by Dr. H. Floss. Where the sequence
herein shows one T, there may be two, resulting in an extension of the fkbH reading frame
to encode the amino acid sequence:

MTIVKCLVWDLDNTLWRGTVLEDDEVVLTDEIREVITTLDDRGILQAVASKNDHD 

LAWERLERLGVAEYFVLARIGWGPKSQSVREIATELNFAPTTIAFIDDQPAERAEVA 

FHLPEVRCYPAEQAATLLSLPEFSPPVSTVDSRRRRLMYQAGFARDQAREAYSGPD

EDFLRSLDLSMTIAPAGEEELSRVEELTLRTSQMNATGVHYSDADLRALLTDPAHE 

VLVVTMGDRFGPHGAVGIILLEKKPSTWHLKLLATSCRVVSFGAGATILNWLTDQG 

AJIAGAIILVADFRRTDRNRMMEIAYRFAGFAJ^SDCPCVSEVAGASAAGVERLHLEP 
SARPAPTTLTLTAADIAPVTVSAAG. 
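
The effect that a single missing T residue can have on a reading frame may be illustrated with a short sketch. The two sequences below are invented toy strings, not the fkbH sequence, and the helper name `codons_before_stop` is an assumption introduced only for illustration. The sketch shows that doubling one base upstream of an in-frame stop codon can shift the frame and extend the open reading frame.

```python
# Toy illustration only: these sequences are invented and are NOT taken from fkbH.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(dna: str, start: int = 0) -> int:
    """Count codons read from `start` until the first in-frame stop codon (or the end)."""
    count = 0
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count

one_t = "ATGGCATGACCGGAAACTAA"    # hypothetical read with one T: ATG GCA TGA (stop)
two_t = "ATGGCATTGACCGGAAACTAA"   # same read with the T doubled: ATG GCA TTG ACC GGA AAC TAA

short_frame = codons_before_stop(one_t)
long_frame = codons_before_stop(two_t)
print(short_frame, long_frame)
```

In this toy case the single-T read stops after two codons, whereas the two-T read continues to a later stop codon, which is the kind of reading frame extension the text describes.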

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase,
which can be supplied by the host cell but can also be supplied by recombinant expression
of the fkbS gene of the present invention. To increase yield of ethylmalonyl CoA, one can
also express the fkbE and fkbU genes. While such production can be achieved using
only the recombinant genes above, one can also achieve such production by placing into the
recombinant host cell a large segment of the DNA provided by the cosmids of the invention.
Thus, for 2-hydroxymalonyl and 2-methoxymalonyl CoA biosynthesis, one can simply
provide the cells with the segment of DNA located on the left side of the FK-520 PKS genes
shown in Figure 1. For ethylmalonyl CoA biosynthesis, one can simply provide the cells
with the segment of DNA located on the right side of the FK-520 PKS genes shown in
Figure 1 or, alternatively, both the right and left segments of DNA.

The recombinant DNA expression vectors that encode these genes can be used to
construct recombinant host cells that can make these important polyketide building blocks
from cells that otherwise are unable to produce them. For example, Streptomyces coelicolor
and Streptomyces lividans do not synthesize ethylmalonyl CoA or 2-hydroxymalonyl CoA.
The invention provides methods and vectors for constructing recombinant Streptomyces
coelicolor and Streptomyces lividans that are able to synthesize either or both ethylmalonyl
CoA and 2-hydroxymalonyl CoA. These host cells are thus able to make polyketides
requiring these substrates that cannot otherwise be made in such cells.

In a preferred embodiment, the present invention provides recombinant
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed
with a recombinant vector of the invention that codes for the expression of the ethylmalonyl
CoA biosynthetic genes. The resulting host cells produce ethylmalonyl CoA and so are
preferred host cells for the production of polyketides produced by PKS enzymes that
comprise one or more AT domains specific for ethylmalonyl CoA. Illustrative PKS
enzymes of this type include the FK-520 PKS and a recombinant PKS in which one or more
AT domains is specific for ethylmalonyl CoA.

In a related embodiment, the present invention provides Streptomyces host cells in
which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have been
deleted by homologous recombination or rendered inactive by mutation. For example,
deletion or inactivation of the fkbG gene can prevent formation of the methoxyl groups at
C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell, FK-506), leading
to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520 (or, in the
corresponding FK-506 producing cell, 13,15-didesmethoxy-13,15-dihydroxy-FK-506). If
the fkbG gene product acts on 2-hydroxymalonyl and the resulting 2-methoxymalonyl
substrate is required for incorporation by the PKS, the AT domains of modules 7 and 8 may
instead bind malonyl CoA and methylmalonyl CoA. Such incorporation results in the
production of a mixture of polyketides in which the methoxy groups at C-13 and C-15 of
FK-520 (or FK-506) are replaced by either hydrogen or methyl.

This possibility of non-specific binding is suggested by the construction of a hybrid
PKS of the invention in which the AT domain of module 8 of the FK-520 PKS replaced the
AT domain of module 6 of DEBS. The resulting PKS produced, in Streptomyces lividans,
6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of module 8 of the FK-520
PKS could bind malonyl CoA and methylmalonyl CoA substrates. Thus, one could possibly
also prepare the 13,15-didesmethoxy-FK-520 and corresponding FK-506 compounds of the
invention by deleting or otherwise inactivating one or more or all of the genes required for
2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI, fkbJ, and fkbK genes. In any
event, the deletion or inactivation of one or more biosynthetic genes required for
ethylmalonyl and/or 2-hydroxymalonyl production prevents the formation of polyketides
requiring ethylmalonyl and/or 2-hydroxymalonyl for biosynthesis, and the resulting host
cells are thus preferred for production of polyketides that do not require the same.
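
The relationship described above between the extender unit loaded by an AT domain and the substituent found at the corresponding carbon of the polyketide can be summarized as a small lookup. This is a minimal sketch: the dictionary and function names are assumptions introduced for illustration, and the mapping is limited to the extender units named in the text.

```python
# Substituent contributed at the corresponding carbon by each extender unit named in the text.
SUBSTITUENT_BY_EXTENDER = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
    "2-hydroxymalonyl": "hydroxyl",
    "2-methoxymalonyl": "methoxy",
}

def predicted_substituents(accepted_extenders):
    """Return the sorted set of substituents expected for the extender units an AT can load."""
    return sorted({SUBSTITUENT_BY_EXTENDER[unit] for unit in accepted_extenders})

# Case discussed in the text: 2-methoxymalonyl is unavailable and the module 8 AT
# binds malonyl CoA or methylmalonyl CoA instead.
mixture = predicted_substituents(["malonyl", "methylmalonyl"])
print(mixture)
```

The two-element result corresponds to the mixture of hydrogen- and methyl-substituted polyketides that the text anticipates when the methoxymalonyl substrate is not available.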

The host cells of the invention can be grown and fermented under conditions known
in the art for other purposes to produce the compounds of the invention. See, e.g., U.S.
Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference, for
suitable fermentation processes. The compounds of the invention can be isolated from the
fermentation broths of these cultured cells and purified by standard procedures. Preferred
compounds of the invention include the following compounds: 13-desmethoxy-FK-506;
13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506; 13,15-didesmethoxy-FK-520;
13-desmethoxy-18-hydroxy-FK-506; 13-desmethoxy-18-hydroxy-FK-520;
13,15-didesmethoxy-18-hydroxy-FK-506; and 13,15-didesmethoxy-18-hydroxy-FK-520. These
compounds can be further modified as described for tacrolimus and FK-520 in U.S. Patent
Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323; 4,980,466; and 4,920,218, incorporated
herein by reference.

Other compounds of the invention are shown in Figure 8, Parts A and B. In Figure 8,
Part A, illustrative C-32-substituted compounds of the invention are shown in two columns
under the heading R. The substituted compounds are preferred for topical administration
and are applied to the dermis for treatment of conditions such as psoriasis. In Figure 8, Part
B, illustrative reaction schemes for making the compounds shown in Figure 8, Part A, are
provided. In the upper scheme in Figure 8, Part B, the C-32 substitution is a tetrazole
moiety, illustrative of the groups shown in the left column under R in Figure 8, Part A. In
the lower scheme in Figure 8, Part B, the C-32 substitution is a disubstituted amino group,
where R3 and R4 can be any group similar to the illustrative groups shown attached to the
amine in the right column under R in Figure 8, Part A. While Figure 8 shows the
C-32-substituted compounds in which the C-15-methoxy is present, the invention includes these
C-32-substituted compounds in which C-15 is ethyl, methyl, or hydrogen. Also, while C-21
is shown as substituted with ethyl or allyl, the compounds of the invention include the
C-32-substituted compounds in which C-21 is substituted with hydrogen or methyl.

To make these C-32-substituted compounds, Figure 8, Part B, provides illustrative
reaction schemes. Thus, a selective reaction of the starting compound (see Figure 8, Part B,
for an illustrative starting compound) with trifluoromethanesulfonic anhydride in the
presence of a base yields the C-32 O-triflate derivative, as shown in the upper scheme of
Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or triazole derivatives
provides the C-32 tetrazole or triazole derivative. As shown in the lower scheme of Figure
8, Part B, reacting the starting compound with p-nitrophenylchloroformate yields the
corresponding carbonate, which, upon displacement with an amino compound, provides the
corresponding carbamate derivative.

The compounds can be readily formulated to provide the pharmaceutical
compositions of the invention. The pharmaceutical compositions of the invention can be
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or liquid
form. This preparation contains one or more of the compounds of the invention as an active
ingredient in admixture with an organic or inorganic carrier or excipient suitable for
external, enteral, or parenteral application. The active ingredient may be compounded, for
example, with the usual non-toxic, pharmaceutically acceptable carriers for tablets, pellets,
capsules, suppositories, solutions, emulsions, suspensions, and any other form suitable for
use. Suitable formulation processes and compositions for the compounds of the present
invention are described with respect to tacrolimus in U.S. Patent Nos. 5,939,427; 5,922,729;
5,385,907; 5,338,684; and 5,260,301, incorporated herein by reference. Many of the
compounds of the invention contain one or more chiral centers, and all of the stereoisomers
are included within the scope of the invention, as pure compounds as well as mixtures of
stereoisomers. Thus the compounds of the invention may be supplied as a mixture of
stereoisomers in any proportion.

The carriers which can be used include water, glucose, lactose, gum acacia, gelatin,
mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal silica,
potato starch, urea, and other carriers suitable for use in manufacturing preparations, in
solid, semi-solid, or liquefied form. In addition, auxiliary stabilizing, thickening, and
coloring agents and perfumes may be used. For example, the compounds of the invention
may be utilized with hydroxypropyl methylcellulose essentially as described in U.S. Patent
No. 4,916,138, incorporated herein by reference, or with a surfactant essentially as
described in EPO patent publication No. 428,169, incorporated herein by reference.

Oral dosage forms may be prepared essentially as described by Hondo et al., 1987,
Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by reference. Dosage
forms for external application may be prepared essentially as described in EPO patent
publication No. 423,714, incorporated herein by reference. The active compound is included
in the pharmaceutical composition in an amount sufficient to produce the desired effect
upon the disease process or condition.

For the treatment of conditions and diseases relating to immunosuppression or
neuronal damage, a compound of the invention may be administered orally, topically,
parenterally, by inhalation spray, or rectally in dosage unit formulations containing
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The
term parenteral, as used herein, includes subcutaneous injections, and intravenous,
intramuscular, and intrasternal injection or infusion techniques.

Dosage levels of the compounds of the present invention are of the order from about
0.01 mg to about 50 mg per kilogram of body weight per day, preferably from about 0.1 mg
to about 10 mg per kilogram of body weight per day. The dosage levels are useful in the
treatment of the above-indicated conditions (from about 0.7 mg to about 3.5 g per patient
per day, assuming a 70 kg patient). In addition, the compounds of the present invention may
be administered on an intermittent basis, i.e., at semi-weekly, weekly, semi-monthly, or
monthly intervals.
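
The per-patient figures follow from multiplying the per-kilogram range by body weight. The sketch below works through that arithmetic for the 70 kg patient assumed in the text; the function name and its default argument are assumptions introduced for illustration.

```python
def daily_dose_range_mg(low_mg_per_kg: float, high_mg_per_kg: float,
                        body_weight_kg: float = 70.0):
    """Convert a per-kilogram daily dose range into a per-patient daily range in mg."""
    return low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg

# Broad range stated in the text: 0.01 mg/kg to 50 mg/kg per day.
broad_low_mg, broad_high_mg = daily_dose_range_mg(0.01, 50)
# Preferred range stated in the text: 0.1 mg/kg to 10 mg/kg per day.
preferred_low_mg, preferred_high_mg = daily_dose_range_mg(0.1, 10)

print(broad_low_mg, broad_high_mg, preferred_low_mg, preferred_high_mg)
```

For the broad range, the upper bound of 50 mg/kg corresponds to 3500 mg, i.e., 3.5 g, per day for a 70 kg patient.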

The amount of active ingredient that may be combined with the carrier materials to
produce a single dosage form will vary depending upon the host treated and the particular
mode of administration. For example, a formulation intended for oral administration to
humans may contain from 0.5 mg to 5 g of active agent compounded with an appropriate
and convenient amount of carrier material, which may vary from about 5 percent to about
95 percent of the total composition. Dosage unit forms will generally contain from about 0.5
mg to about 500 mg of active ingredient. For external administration, the compounds of the
invention can be formulated within the range of, for example, 0.00001% to 60% by weight,
preferably from 0.001% to 10% by weight, and most preferably from about 0.005% to 0.8%
by weight. The compounds and compositions of the invention are useful in treating disease
conditions using doses and administration schedules as described for tacrolimus in U.S.
Patent Nos. 5,542,436; 5,365,948; 5,348,966; and 5,196,437, incorporated herein by
reference. The compounds of the invention can be used as single therapeutic agents or in
combination with other therapeutic agents. Drugs that can be usefully combined with
compounds of the invention include one or more immunosuppressant agents such as
rapamycin, cyclosporin A, FK-506, or one or more neurotrophic agents.

It will be understood, however, that the specific dosage level for any particular
patient will depend on a variety of factors. These factors include the activity of the specific
compound employed; the age, body weight, general health, sex, and diet of the subject; the
time and route of administration and the rate of excretion of the drug; whether a drug
combination is employed in the treatment; and the severity of the particular disease or
condition for which therapy is sought.

A detailed description of the invention having been provided above, the following
examples are given for the purpose of illustrating the present invention and shall not be
construed as being a limitation on the scope of the invention or claims.

Example 1 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520

The C-13 methoxyl group is introduced into FK-520 via an AT domain in extender
module 8 of the PKS that is specific for hydroxymalonyl and by methylation of the
hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase.
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position into
an inactive derivative that is further degraded by host P450 and other enzymes. The present
invention provides compounds related in structure to FK-506 and FK-520 that do not
contain the C-13 methoxy group and exhibit greater stability and a longer half-life in vivo.
These compounds are useful medicaments due to their immunosuppressive and
neurotrophic activities, and the invention provides the compounds in purified form and as
pharmaceutical compositions.

The present invention also provides the novel PKS enzymes that produce these
novel compounds as well as the expression vectors and host cells that produce the novel
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT
domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the FK-506
and FK-520 PKS. This example describes the construction of recombinant DNA
compounds that encode the novel FK-520 PKS enzymes and the transformation of host cells
with those recombinant DNA compounds to produce the novel PKS enzymes and the
polyketides produced thereby.

To construct an expression cassette for performing module 8 AT domain
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster was
cloned into plasmid pLitmus 38 (a cloning vector available from New England Biolabs).
The 4.6 kb SphI fragment, which encodes the ACP domain of module 7 followed by module
8 through the KR domain, was isolated from an agarose gel after digesting the cosmid
pKOS65-C31 with SphI. The clone having the insert oriented so the single SacI site was
nearest to the SpeI end of the polylinker was identified and designated as plasmid
pKOS60-21-67. To generate appropriate cloning sites, two linkers were ligated sequentially as
follows. First, a linker was ligated between the SpeI and SacI sites to introduce a BglII site
at the 5' end of the cassette, to eliminate interfering polylinker sites, and to reduce the total
insert size to 4.5 kb (the limit of the phage KC515). The ligation reactions contained 5
picomolar unphosphorylated linker DNA and 0.1 picomolar vector DNA, i.e., a 50-fold
molar excess of linker to vector. The linker had the following sequence:

5'-CTAGTGGGCAGATCTGGCAGCT-3'
    3'-ACCCGTCTAGACCG-5'

The resulting plasmid was designated pKOS60-27-1.

Next, a linker of the following sequence was ligated between the unique SphI and
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module 8
cassette. The linker employed was:

    5'-GGGATGCATGGC-3'
3'-GTACCCCTACGTACCGAATT-5'

The resulting plasmid was designated pKOS60-29-55.
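
The two linkers above can be checked for internal consistency with a short sketch: the shorter strand of each linker should be base-paired to the middle of the longer strand, leaving four unpaired bases at each end, and each linker should carry the restriction site it is said to introduce. The function and variable names are assumptions introduced for illustration; the strands are entered in the orientation in which they are printed.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def paired_core(long_strand: str, short_strand: str, offset: int) -> bool:
    """True if short_strand, written antiparallel beneath long_strand,
    is fully base-paired to long_strand starting at position `offset`."""
    window = long_strand[offset:offset + len(short_strand)]
    return len(window) == len(short_strand) and all(
        COMPLEMENT[a] == b for a, b in zip(window, short_strand)
    )

# First linker: top strand read 5'->3', bottom strand as printed (3'->5').
linker1_top = "CTAGTGGGCAGATCTGGCAGCT"
linker1_bottom = "ACCCGTCTAGACCG"
# Second linker: top strand read 5'->3', bottom strand as printed (3'->5').
linker2_top = "GGGATGCATGGC"
linker2_bottom = "GTACCCCTACGTACCGAATT"

linker1_ok = paired_core(linker1_top, linker1_bottom, 4)
linker2_ok = paired_core(linker2_bottom, linker2_top, 4)
has_bglii_site = "AGATCT" in linker1_top   # BglII recognition sequence
has_nsii_site = "ATGCAT" in linker2_top    # NsiI recognition sequence
molar_excess = 5 / 0.1                     # linker relative to vector in the ligation
print(linker1_ok, linker2_ok, has_bglii_site, has_nsii_site, molar_excess)
```

The four-base unpaired ends left by this pairing (CTAG and AGCT on the first linker; CATG and TTAA, read 5' to 3', on the second) are what allow each linker to anneal to the cut vector ends.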

To allow in-frame insertions of alternative AT domains, sites were engineered at the
5' end (AvrII or NheI) and 3' end (XhoI) of the AT domain using the polymerase chain
reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the PCR and
sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and either
Avr-rev or Nhe-rev:

SpeBgl-fwd 5'-CGACTCACTAGTGGGCAGATCTGG-3'
Avr-rev 5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3'
Nhe-rev 5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3'

The PCR included, in a 50 µl reaction, 5 µl of 10x Pfu polymerase buffer
(Stratagene), 5 µl 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM
dGTP, 1 mM 7-deaza-GTP), 5 µl DMSO, 2 µl of each primer (10 µM), 1 µl of template
DNA (0.1 ng/µl), and 1 µl of cloned Pfu polymerase (Stratagene). The PCR conditions
were 95°C for 2 min., 25 cycles at 95°C for 30 sec., 60°C for 30 sec., and 72°C for 4 min.,
followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and the Litmus
vectors were cut with the appropriate restriction enzymes (BglII and AvrII or SpeI and
NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs),
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2,
respectively.
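
The final concentration of each component in the 50 µl reaction follows from a simple dilution calculation. The sketch below works through it for the volumes and stock concentrations listed above; the function and variable names are assumptions introduced for illustration.

```python
REACTION_VOLUME_UL = 50.0

def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = REACTION_VOLUME_UL) -> float:
    """Dilution: C_final = C_stock * V_added / V_total (units follow the stock)."""
    return stock * volume_ul / total_ul

buffer_fold = final_concentration(10, 5)     # 10x buffer, 5 ul -> fold concentration
datp_mm = final_concentration(2, 5)          # 2 mM dATP in the 10x mixture -> mM
dgtp_mm = final_concentration(1, 5)          # 1 mM dGTP in the 10x mixture -> mM
primer_um = final_concentration(10, 2)       # 10 uM primer stock, 2 ul -> uM
dmso_percent = 100 * 5 / REACTION_VOLUME_UL  # 5 ul DMSO -> % v/v
template_ng = 0.1 * 1                        # 0.1 ng/ul x 1 ul -> ng per reaction

print(buffer_fold, datp_mm, dgtp_mm, primer_um, dmso_percent, template_ng)
```

Under these figures the buffer is at 1x, dGTP and 7-deaza-GTP are each at half the concentration of the other nucleotides, and DMSO makes up one tenth of the reaction volume.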

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify sequence
3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev:

BsrXho-fwd 5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC-3'
NsiAfl-rev 5'-CGACTCACTTAAGCCATGCATCC-3'

PCR conditions were as described above. The PCR fragment was cut with BsrGI
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and inserted
into pKOS60-37-2 cut with BsrGI and AflII to give the plasmids pKOS60-39-1 and
pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and XhoI or
NheI and XhoI, respectively, to insert heterologous AT domains specific for malonyl,
methylmalonyl, ethylmalonyl, or other extender units.

Malonyl and methylmalonyl-specific AT domains were cloned from the rapamycin
cluster using PCR amplification with a pair of primers that introduce an AvrII or NheI site at
the 5' end and an XhoI site at the 3' end. The PCR conditions were as given above and the
primer sequences were as follows:

RATN1 5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3'
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA),
RATMN2 5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3'
(Rap AT shorter version 5'- sequence and specific for malonyl CoA),
RATMMN2 5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3'
(Rap AT shorter version 5'- sequence and specific for methylmalonyl CoA), and
RATC 5'-ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG-3'
(Rap DH 5'- sequence and universal for malonyl and methylmalonyl CoA).

[Diagram of a generic rapamycin PKS module ("Any Rap Module"; KS, AT, and DH domains)
marking the primer sites N1 (AvrII), MN2 (NheI), MMN2 (NheI), and C (XhoI).]

Because of the high sequence similarity in each module of the rapamycin cluster,
each primer was expected to prime any of the AT domains. PCR products representing ATs
specific for malonyl or methylmalonyl extenders were identified by sequencing individual
cloned PCR products. Sequencing also confirmed that the chosen clones contained no
cloning artifacts. Examples of hybrid modules with the rapamycin AT12 and AT13
domains are shown in a separate figure.
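
The restriction sites that the primers in this example are said to introduce can be confirmed by searching each primer for the corresponding recognition sequence. The sketch below does this for the primers listed above; the dictionary and function names are assumptions introduced for illustration, and the degenerate bases (R, Y, S) in the rapamycin primers do not fall within any of the sites searched.

```python
# Recognition sequences (5'->3') of the enzymes named in this example.
RECOGNITION_SITES = {
    "SpeI": "ACTAGT", "BglII": "AGATCT", "AvrII": "CCTAGG", "NheI": "GCTAGC",
    "XhoI": "CTCGAG", "BsrGI": "TGTACA", "AflII": "CTTAAG", "NsiI": "ATGCAT",
}

# Primer sequences as listed in the text, written 5'->3'.
PRIMERS = {
    "SpeBgl-fwd": "CGACTCACTAGTGGGCAGATCTGG",
    "Avr-rev": "CACGCCTAGGCCGGTCGGTCTCGGGCCAC",
    "Nhe-rev": "GCGGCTAGCTGCTCGCCCATCGCGGGATGC",
    "BsrXho-fwd": "GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC",
    "NsiAfl-rev": "CGACTCACTTAAGCCATGCATCC",
    "RATN1": "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG",
    "RATMN2": "ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG",
    "RATMMN2": "ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA",
    "RATC": "ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG",
}

def sites_in(primer_seq: str):
    """Return the sorted names of enzymes whose recognition sequence occurs in the primer."""
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in primer_seq)

site_map = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, sites in site_map.items():
    print(name, sites)
```

Each primer name encodes the sites it carries (for example, SpeBgl-fwd carries SpeI and BglII sites), so the output can be compared directly against the cloning steps described in the text.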

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS with
the endogenous AT domain replaced by the AT domain of module 12 of the rapamycin PKS
has the DNA sequence and encodes the amino acid sequence shown below. The AT of rap
module 12 is specific for incorporation of malonyl units.

20 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
IWQLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
25 FKDLGIDSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 2 50 
FPTPHVLAGKLG-DELTG 
30 CACCGGCGCGCGCGTCGTGCCGGGGACGGCGGGCAGGGCCGGTGCGCACG 300 
T RA P V V P RTAA TAG A H 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAI VGMACRLPGGV 
GCGTCACCCGAGGAGC7GTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
35 AS PEELWHLVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAI YD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
40 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
T • G AT G FDAAFFG I S P RE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW 
■ AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
45 EAFESAGITPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGG7GCGGA 7 00 

? G V F V Z A F S Y G Y G T 3 A D 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
T DG-FG AT G S Q 7 > S V L S G 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800
R L S Y F Y G L E G P A V T V D T
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTGGCTGCG 350 

^ 3LVALHQ A GQ3 L ?. 

' ' ~---.7G 77C3C7CGCCC7GG7CGGCGGCG7CACGG73A.7GGCG7 90C 

SLALVGGVTV'MA 
rCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
~ ?GG - r VEFSRQRGLAPD 
GGCCCGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

G R A K" A F G A G A D G T S F A E 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDA£RN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTV1AVVRGSAVNQDG 
GCC7CCAACGGGC7G7CGGCGCCGAACGGGCCG7CGCAGGAGCGGGTGAT 1150 
15 ASM GLSAPNGPSQERVI 

CCGGCAGGGCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADVDA 
TCG AGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 12 50 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

"VLATYGQERAT PLLLG 
CTCGCTGAAGTCGAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 13 50 
S 1 K S N I G H A Q A A S G V A 

"A7CA7CAAGA7GG7GCAGGCCC7CCGGCACGGGGAGC7GCCGCCGACG 14 00 
- 5 S I I K M. V Q A L R H G E L P P T 

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADEPSPHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
EL L T SAR PWPE7 DR PR 
30 GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 

L E 5 A P PTQ PADNAV I.ER 
GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 
35 APEWVPLVISARTQSA 

TGACTGAGCACG AGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 17 00 
L 7 E H EGRLRAYLAAS PG 
G7GGA7A7GCGGGC7GTGGCA7CGACGC7GGCGA7GACACGG7CGG7G7T 17 50 
"-DMRAVASTLAMTRSVF 
40 CGAGCACCG7GCCG7GC7GC7GGGAGA7GACACCG7CACCGGCACCGC7G 18 00 
EH ? A V LLG D / DT V TG TA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGG ACAGGGGTCGCAGCGT 18 50 

S 3 ? R A V F V F P G Q G S Q R 
GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1 900 
45 A G M G EELAAAFPVFARI 

CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG 1 950 

HQQVWD. LLDVPDLEVN 
AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000 
ETG Y AQPALFAMQVALF 
50 GGGCTGCTGGAATCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTC 2050 
GLL ESWGVRPDAVIGHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 

VGELAAAYVSGVW5LE 
ATGCCTGCACTTTGGTG7CGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
55 7; AC 7 LVSARA'RLMQALP 

CCGGGTGGGG7GA7GGTCGCTG7CCCGGTCTCGGAGGATGAGGCCCGGGC 2200 

A G G 7 X V AVPV£E"E A R A 
- o * -j-w . oovj i or^G^jGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 22 5 0 
VLGEGVEIAA V K G P 3 S 
60 TCG77C7C7CCGG7GA7GAGGCCGCCGTGC7GCAGGCCGCGGAGGGGC7C 2 300 
"•' v ^ S G C- E A A V - Q A A E Z L 
GGGAAGTGGACGCGGC7CGCGACCA.GCCACGCGTTCCAT7CCGCCCG7A7 2 3 50 
G K W 7 R L A 7 S H A ~ H S A ? M 
AAC'_CATGC i GGAGGAGTTCCGGGCGGTGG 7CGAAGGCC7GAT7C7A.7C I I 30 
5 E ? M h E E F P. A V A E G L 7 V 

GGACGCCGCAGG7CTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 2 4 50 

7 P Q V 5 M A V G D Q V T T A E 
TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2 500 
V W V R Q V R D T V R f G E Q V A . 
1 0 CTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550 
SYEDAVFVELGADRSL 
CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2 600 
ARLVDGVAMLHGDHEIQ 
GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2 650 
15 AAIGALAHLYVNGVTVD 

CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 27 00 

WPALLGDAPATRVLDL 
CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 27 50 
PTYAFQHQRYWLESARP 
20 GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 28 00 
AASDAGKPVLGSGIALA 
CGGGTCGCCGGGCCGGGTG7TCACGGGTTCCGTGCCGACCGGTGCGGACC 2 3 50 

G S P G R V F 7 . G S V ? 7 G A D 
GCGCGGTGT7CGTCGCCGAGC7GGCGCTGGCCGCCGCGGACGCGG7CGAC 2 900 
25 RAVFVAELALAAADAVD 

TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2 950 

CA7VERLDIASVPGRPG 
CCA7GGCCGGACGACCG7ACAGACC7GGG7CGACGAGCCGGCGGACGACG 3000 
HGR77VQ7WVDEPADD 
30 GCCGGCGCCGG77CACCG7GCACACCCGCACCGGCGACGCCCCG7GGACG 3050 
GRRRF7VH7RT G DAPW7 
C7GCACGCCGAGGGGG7GC7GCGCCCCCA7GGCACGGCCC7GCCCGA7GC 3100 

LHAEGVLRPHGTALPDA 
GGCCGACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 
-35 ADAEWPPPGAVPADGL 

CGGG7G7G7GGCGCCGGGGGGACCAGG7C7TCGCCGAGGCCGAGG7GG AC 3200 
PGVWRRGDQVFAEAEVD 
GGACCGGACGG777CG7GG7GCACCCCGACC7GC7CGACGCGG7C77C7C 32 50 
GPDGFVVHPDLLDAVFS 
40 CGCGGTCGGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGG 3 300 
AVG DG S R Q PAG W RD L 7 
7GCACGCG7CGGACGCCACCG7AC7GCGCGCC7GCC7CACCCGGCGCACC 3 350 
V HAS DA7VL RAC L7 R R 7 
GACGGAGCCA7GGGA77CGCCGCC77CGACGGCGCCGGCC7GCCGG7AC7 34 00 
45 DGAMG FAAFDGAGLPVL 

CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 3 4 5 0 

7AEAV7LREVASPSGS 
AGGAG7CGGACGGCC7GCACCGG77GGAG7GGC7CGCGG7CGCCGAGGCG 3 500 
EESDGLHRLEWLAVAEA 
50 G7C7ACGACGG7GACC7GCCCGAGGGACA7G7CC7GA7CACCGCCGCCCA 3550 
VYDGDLPEGHVLI7AAH 
CCCCGACGACCCCGAGGACA7ACCCACCCGCGCCCACACCCGCGCCACCC 3 600 

PDDPEDIPTR'AKTRAT 
GCG7CC7GACCGCCC7GCAACACCACC7CACCACCACCGACCACACCC7C 3650 
55 ?. VLTALQHHLTTTDHTL 

A7CG7CCACACCACC ACCGACCCCGCCGGCGCCACCGTCACCGGCCT CAC 3700 

I V H T 7 T D ? A G A. 7 V T G 1 7 
- — o^j-iv-v-v-c^^w AoAACGAACACCCCCA»CCGC.—.7CC ~ CC7CA7CGrv-».-vwWw j, : z ~ 
RTAQNEHPHRIRLIE.T 
60 ACCACCCCCACACCCCCC7CCCCC7GGCCCAAC7CGCCACCC7CGACCAC 3 3 00 
£ H ? H - ? - ? L A Q L A T L D H 
CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 38 5 0 
~ !L„ H L ? L T H H T L H H ? K L T P 
CrTCCACACCACCACCCCACCCArCACCACCCCCCTCA-ACCCCGAACACG 3 900 

lh t :tf?7tt?lnpeh 

CCATCATCATCACCGGCGGCTCCGGCACCCTCCCCGGCATCCTCGCCCGC 3 95C 
A T - - T G G S G 7 L A G ; L A R 
CACCTGAACCACCGCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4 000 
; - : L N H ? H T Y L L S R T P P P 'J 

CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4 050 

AT P GTHL PCDVGDPHQ 
TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100 
I-ATTLTH I PQPLTAIFH 
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150 
15 TAATLDDG I LHALTPDR 

CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4 200 

LTTVLHPKANAAWHLH 
ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4 2 50 
HLTQNQPLTHFVLYSSA 
GCCGGCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4 300 

A A V L G S PGQGNYAAANA 
CTTCCTCGACGCCGTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4 350 

TLDALAT K RHTLGQPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4 4 00 
25 7 S I A W G M W HTTSTLTGQ 

CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 4 4 50 

LDDADRDRIRRGGTLPI 
CACGGACGACGAGGGCATGGGGATGCAT 
7 D D E G 






The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS with 
the endogenous AT domain replaced by the AT domain of module 13 (specific for 
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino 
acid sequence shown below. 



->5 AGATCTGGCAGCTCCCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
CLAEALLTLVR^EST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCCACCGCGGC 100 

AAVLGHVGG'ED.I PATAA 
G77CAAGGACC7CGGCATCGACTCGCTCACCGCGGTCCAGC7GCGCAACG 150 
40 FKDLGI DSLTAVQLRN 

CCCTCACCG AGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 2 00 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 2 5 0 
FPTrHVLAGKLGDELTG 
45 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 3 50 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
50 AS PEELWH LVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGC7GGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 

? » ? o a : g y 7 r v ?. h g g f l 

^5 ACCGGCGCGACAGGC7TCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 5 50 
TGATGFDA. AFFG I S PRE 
GGCCC7CGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 60C 
^ - a :-: ? 0 Q R v L L £ ; 5 w 

AGGCGTTCGAA-AGCGCCGGCATCACCCCGGAC7CGACCCGCGGCACCGAC 650 
^ A E £ A G T T P D 3 7 R G 3 D 

- ; _ CG-^7- - *-T7377 73GCG7777C73GTACGG77A73GGAGC3G7GGGGA 70C 
3 7 G V F 7 G A F S Y G V G 7 G A D 

CACCGACGGC7TCG3CGCGACCGGC7CGCAGACCAG7G7GC7C7CCGGCC 7 50 

T G G r" G A T G S Q T 3 V 1 S G 
GGCTG77GTAG7777ACGGTGTGGAGGGTCGGGGG37GAGGG7CGACACG 300 
i- 5 V T Y G L E G P A 7 T V D T 
10 GCGTG 77 CG TC G7G G 3 TGGTGGGGCTGC AC CAGGCCGGGCAGTCGCTGCG 8 50 
AC5S5LVALHQAGQSLR* 
CTCCGGGGAATGC7CGGTGGCCCTGGTCGGCGGGGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
15 S PGG FVEFSRQRGLAPD 

GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVL IVERLSDAERN 
20 GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GH7VLAVVRGSA7NQDG 
GCCTCCAACGGGC7GTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
ASNG15APNGPSQERVI 

CCGGCAGGCCC7CGCCAACGCCGGGC7CACCCCGGC3GACG7GGACGCCG 1200 
25 R QALANAGLTPADVDA 

7CGAGGCCCACGGCACCGGCACCAGGC7GGGCGACCCCATCGAGGCACAG 12 50 
V E A H G 7 G 7 R L G D r I E A Q 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLAT YGQERA7PLLLG 
30 C7CGC7GAAGTCCAACATCGGCCACGCCCAGGCCGCG7CCGGCGTCGCCG 1350 
SLKS N IGHAQAASGVA 
GCA7CA7CAAGATGG7GCAGGCCCTCCGGCACGGGGAGC7GCCGCCGACG 14 00 
GI I KMVQALRHGELP PT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 
35 LHADE PSPHVDWTAGAV 

CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 

ELLTSARPWPETDRPR 
GGGCGGGCGTGTCGTCC7TCGGAGTCAGCGGCACCAACGCCCACGTCATC 1550 
RA GVS 3 FGVSG7NAHVI 
40 7TGGAGAGCGGACGGGCCGGTCAGGCCGCGGAGGAGGGGCAGCGTGTTGA 1600 
LESAr PAQPAEEAQPVE 
GACGCCGG7GG7GGCC7CGGA7C7GC7GCCGC7GGTGA7A7CGGCCAAGA 1650 

T ? V V A S D V L P L V I , S A K 
7CCAGCCCCCCC7GACGGAACACGAAGACCGGC7GCGCGCC7ACC7GGCG 17 00 
45 7QFA1TEHEDRLRAYLA 

GGGTGGGGGGGGGGGG AT AT ACGGGGTGTGGCATCGAGGCTGGGGGTGAG 17 50 

AS PGADI RAVASTLAVT 
AC.GG7CGG7G77CGAGCACCGCGCCGTACTCCT7GGAGATGACACCG7CA 1800 
RSVFEHRAVLLGDDTV 
50 GCGGCACCGCGGTGACCGACCCCAGGATCGTG7TTGTCTTTCCCGGGCAG 18 50 
T G T A V T D P R I V F V F P G Q 
GGGTGGCAG7GGC7GGGGA7GGGCAG7GCAC7GCGCGA77CG7CGG7GG7 1 900 

GWQW1GMGSALRDS3VV- 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 
55 F A E R X A E C A A A L R E F V 

ACTGGGATCTG777ACGG7TCTGGATGATCCGGGGGTGGTGGAGCGGGTT 2000 
- W 7 L F 7 V D D P A Y V 7 R V 

3A7G7GG7CCAGGGGGG7TGGTGGGCGATGA7GGT77CGC7GGGCGCGGT 2050 
D V V Q ? A S W A M M V S L A A V 
60 37GGCAGGCGGCC3G7C7GCGGCCGGA7GCGG7GA7CGGCCA77CGCAGG 2100 
WQAAGVRPDAVI G H £ Q 
GTGAGATCGCCGCPiGCTTGTGTGGCGGGTGCGGTGTCACTACGCGATGCC 2150 
G£IAAACVAGAVSL?. DA 

wLuvj.-. i ^ ^ i GACC77GCGCAGCCAGGCGATCCCCCGGGGCC73GCG 3G 2 ^0^ 
A R - V T 1 R S Q A I A R G 1 A G 

CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGA3CTGG 2250 
R G A M A S V A L P A Q D V Z L 

TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2 300 
V 0 G A W I A A H N G P A S 7 7 I 

GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2 350 

AGTPEAVDHVLTAHIAQ 
AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 2 4 00 

GVRVRRITVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 24 50 
15 HV£LIRDELLDI-TSDSS 

TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 

SQTPLVPWLSTVDGTWV 
CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550 
DSPLDGEYWYRNLREP 
20 TCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCCCAGGGCGACACCGTG 2 600 
VGFHPAVSQLQAQGDT.V 
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2 650 

FVEVSAS PVLLQAMDDD 
TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCAGCCGGA 2700 
25 VVTVAT LRRDDG DA.TR 

TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 27 50 
MLTALAQAYVHGVTVDW 
CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800 
FAX LGTTTTRVLDLc.TY 
30 CGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850 
AFQHQRYWLESARPAA 
CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2 900 
SDAGH PVLGSGIALAGS 
CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2 950 
35 PGRVFTGSVPTGADRAV 

GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 

FVAELALAAADAVDCA 
CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
40 CGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCG 3100 
RTTVQTWVDEPADDGRR 
CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGC7GCACG 3150 

RFTVHTRTGDAPWTLH 
CCGAGGGGGTGCTGCGGCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200 
45 AEGVJJRPHGTALPDAAD 

GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGG7GT 32 50 

AEWPPPGAVPADGLPGV 
GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3 300 
WRRGDQVFAEAEVDGP 
50 ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3 3 50 
D G FVVH PDLLDAV FSAV 
GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 34 00 

GDGS RQPAGWRDLTVKA 
CTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAG 34 50 
55 SDATVLRACLTRRTDG 

CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500 
AMG FAAFDGAGL ? V L 7 A 
GAGGCGG7GACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3 5 50 
EAVTLREVASPSGSEES 
60 GGACGGCCTGCACCGG7TGGAGTGGCTCGCGGTCGCCGAGGCGG7CTACG 3 600 
DGLHRLEWLAVASAVY 
ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
OGDLPEGHVLITAAHPD 
G.'.CCCCGAGGACATACCGACCCGCGCCCACACGCGCGCCACCCGCGTCCr 270C 
5 2 P Z D 1 PTRAHTRATRVL 

GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 37 50 
T A L 0 H H L 7 T T D H T L I V 

ACACCACCACCGACCCCCCCGGCGCCACCG7CACCGGCC7CACCCGCACC 38 00 
H T T T D ? A G A T V T G 1 T R T 
10 GCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCC 38 50 
AQNEH ? H R I RLIE7DHP 
CCACACCCCCG7CCCCC7GGCCCAAC7CGCCACCC7CGACCACCCCCACC 3 900 

HTPLPLAQLATLDHPH 
TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3 950 
15 LRLTH HTLHHPHLTPLH 

ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCAT 4 000 

TTTPPTTTPLNPEHAII 
CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4 050 
I TGGSGTLAGI LARHL 
20 ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4 100 
NHPHTYLLS RTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150 

PGTHL PCDVGDPHQLAT 
CACCC7CACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4 200 
25 TL7H I PQPLTAI FHTA 

CCACCCTCGACG ACGGCATCG7CCACGCCC7C ACCCCCGACCGCCTCACC 4 250 
ATLDDGI LHALTPDRLT ^ ' 

ACCG7CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4 300 
TVLHPKANAAWHLHHLT 
30 CCAAAACCAACCCCTCACCCACT7CGTCCTCTAC7CCAGCGCCGCCGCCG 4 3 50 
QNQPLTH FVLYSSAAA 
TCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4 4 00 
VLGS PGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4 4 50 
35 DALATHRHTLGQPATSI 

CGCCTGGGGCA7GTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4 500 

AWGMW H T 7 S T L TGQL D 
ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4 550 
DA'ORDRI RRGG FLPITD — 
40 GACGAGGGCA7GGGGA7GCA7 
D E G 
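
The pairing of nucleotide rows with amino acid rows in the listing above can be checked computationally. The following is a minimal Python sketch, assuming the standard genetic code and assuming that the reading frame begins at the third nucleotide of the fragment (an inference from the printed translation, not a statement in the text). The 50-nucleotide string is the first row as it reads in the cassette listings, with characters printed as "7" taken to be T and one position resolved by comparison between listings; both are assumptions. The function name `translate` is illustrative only.

```python
# Standard genetic code, packed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate dna from the given offset, ignoring a trailing partial codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# First 50 nt of a module 8 cassette as read from the listing (assumed reading).
FIRST_ROW = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"

if __name__ == "__main__":
    # Frame offset 2 is an assumption chosen to match the printed amino acids.
    print(translate(FIRST_ROW, offset=2))
```

With these assumptions the translation ends in QLAEALLTLVREST, which agrees with the amino acid row printed under the first nucleotide row. The same check can be applied row by row to locate positions where the printed copy is damaged.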

The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS with 
the endogenous AT domain replaced by the AT domain of module 12 (specific for malonyl 
CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid sequence 
shown below. 

AGATCTGGCAGCTCGCCGAAGCGCTGC7GACGCTCGTCCGGGAGAGCACC 5 0 
QLAEALLTLVREST 
' GCCGCCG7GC7CGGCCACG7GGG7GGCGAGGACA7CCCCGCGACGGCGGC 100 
50 AAVLGHVGGEDI PATAA 

G77CAAGGACC7CGGCA7CGACTCGCTCACCGCGGTCCAGC7GCGCAACG 150 

FKDLG I DSL7AVQLRN 
CCC7CACCG AGGCGACCGG7GTGCGGC7GAACGCCACGGCGGTCTTCGAC 200 
.-.LIZ A T G 7 ?. L t! A 7 A V F D 
55 77CCCGACCCCGCACG7GC7CGCGGGGAAGCTCGGCGACGAACTGACCGG 2 50 
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACCGCCGGTGCGCAC^ 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTCCCCGGCGGGGTC 350 
DEPLAIVGMACRL PGGV 
c-cg7caccccaggagctgtggcacctcgtggcatccggcaccgacgccat 4 00 
- aspeelwhlvasgtdai 
cacggagttcccgacggaccgcggctgggacgtcgacgcgatctacga.ee 4 50 

~ E r ? T OR G W D V DA I Y Q 

C-3GACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 50C 
^ D P DA I G K T F V R H G G F L 
1 0 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
GAT GFD A AFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
* 5 £AFE3AGI TPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 7 00 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
20 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 8 00 
RLS Y FYGLEGPAVTVDT 
CCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 8 50 

ACS SSLVA. LHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
- 5 SGECSLALVGGVTVMA 

CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
30 GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
35 ASNGLSAPNGPSQERVI 

CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGG ACGTGGACGCCG 1200 

3 QALANAGLTPADVDA 
T CGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
40 GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATY'GQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SL^KSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
45 3IIKMVQALRHGELPPT 

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1 4 50 

1HADEPSPHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETORPR 
50 "GCCGCCGTCTCCTCCTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 15 50 
R . A A V S S FG V SGTNAH V I 
CTGGAGGCCGGACCCGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1 600 

1 E A G PVTET PAAS PSGD 
CC7TCCCCTGCTGG7GTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
55 1PL .LVSARSPEALDEQ 

TCCGCCGACTGCGCCCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 17 00 

r.RLR AYLDTTPDVDRV 
s.-^, -oTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 17 3 0 
A V A " Q T LA R R T H • F A H R A V 
60 GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 18-00 
i :.. g z t v r t t p p a d ?. ? d 

AAC7CG7C77CG7C7AC7CCGGCCAGGGCACCCAGCA.7CCCGCGA.7GGGC 18 50 
E L 7 ~ V V S G Q G T Q H ? A M G 
GAGCAGC7AGC::GGGGCG77CGGCG7C7TCGGGCGGA7GGA7CAGCAGG7 1900 
5 E Q 1 A A A F P V F A R I H Q V 

G7GGGACC7GC7CGA7G7GCCCGA7CTGGAGGTGAAGGAGACC33TTACG 1950 

W D L L D V P D L E V N E T G Y 
CCCAGCCGGCCC7G7TCGCAATGCAGGTGGCTCT377CGGGC7GCTGGAA 2000 
A Q ? A 1 FAMQVALFG1LE 
1 0 7CG7GGGG7G7ACGACCCGACGCGG7GA7CGGCCA77CGG7GGG7GAGG7 2050 
S W G V ?. P D A V I G H S V G E L 
7GCGGC7GGG7A7G7G7CCGGGG7G7GG7CG77GGAG3A7GCC7GCAC77 2100 
A A A V V S G V W S L Z D A 7 7 

7GG7G7CGGCGCGGGC7CC7CTGATGCAGGC7C7G"CGCGGG73GGG7G 2150 
15 L V S A R A R L M Q A I. ? A G G V 

ATGG7CGCTC7CCCGGTCTCGGAGGATGAGGCCCG3 3CCG7GC7GGGTGA 2200 

M V A V ? V S E D E A R A V L G E 
GGGTG7GGAGA7CGCCGCGGTCAACGGCCCGTCGTC3GTGGT7CTCTCCG 2250 
GV E I AAV NGPS 3 VV LS' 
20 G7GA7GAGGCCGCCG7GC7GCAGGCCGCGGAGGGGC7GGGGAAG7GGACG 2 300 
GDEAAVLQAAEG1GKWT 
CGGC7GGCGACCA3CCACGCGTTCCAT7CCGCCCG7ATGGAACCCATGCT 2350 

R L A 7 S HAFHSAR MErML 
GGAGGAG77CCGGGCGG7CGCCGAAGGCC7GACC7A7CGGACGC7GCAGG 24 00 
25 EEFftAVAEGLTYRTPQ 

7C7CCA7GGCCG77GG7GA7CAGGTGACCACCGC7GAG7AC7GGG7GCGG 24 50 
VSMAVGDQVTTAIYWVR 
CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500 
Q V R D 7 V R F G E Q 7 A S Y E D 
30 CGCCG7G77CG7CGAGC7GGG7GCCGACCGG7CAC7GGCCCGC77GG7CG 2550 
A V FVELGADRSLARLV 
ACGG7G7CGCGA7GC7GCACGGCGACCACGAAA7CCAGGCCGCGA7CGGC 2600 
DGV AM LHGDHEIQA A IG 
GCCC7GGCCCACC7G7A7C7CAACGGCG7CACGG7CGAC7GGGCCGCGC7 2650 
35 A L A H L Y V N G V • 7 7 D W ? A L 

CC7GGGCGA7GC7CCGGCAACACGGG7GC7GGACC77CCGACA7ACGCC7 2 7 00 

LGDAPA7RVLDZ_P7YA 
TCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCC3GCCGCATCCGAC 27 50 
FQKQRYWLESARPAASD 
40 GCGGGCCACCGCG7GC7GGGC7CCGG7A7CGCCC7CGCCGGG7CGCCGGG 2800 
AGH PVLGSGI'ALAGS PG 
CCGGG7G77CACGGG77CCG7GCCGACCGG7GCGGACCGCGCGG7G7TCG 28 50 

RVF7GSVP7GAGRAVF 
7CGCCGAGC7GGCGC7GGCCGCCGCGGACGCGG7CGAC7GCGCCACGG7C 2 900 
45 7 A E L A L A A A D A V Z C A 7 V 

GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2 950 

ERLDIASVPGRPGH GR7 
GACCG7ACAGACC7GGG7CGACGAGCCGGCGGACGACGGCCGG7GCCGGT 3000 
7VQ7WVDEPADGGRRR 
50 7CACCG7GCACACCCGCACCGGCGACGCCCCG7GGAGGC7GCAGGCCGAG 3050 
F 7 V H 7 R T • G DA P W 7 L H A E 
GGGG7GC7GCGCCCCCA7CGCACGGCCCTGCCCGA7GCGGCCGACGCCGA 3100 

G V L R F H G 7 A L P Z A A : A E 
G7GGCCCCCACCGGGCGCGG7GCCCGCGGACGGGC7GGGGGG7G7G7GGC 3150 
5 5 W P P P G A V P A D G 1 P G 7 W 

GCCGGGGGGACCAGG7C77CGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
R R G D Q V F A EAEVGGrOG 
- 7 GG 7 7 GCaC G G GGACC7GG7CGAGGCGG7C77 G7GGGCvj— . ^uvjCvjA o 2 5 0 
F V V* H P D L L D A V F S A 7 G D 
60 CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACSCGTCGG 3 300 
~ : $ ?, Q ? A G W R D L T V H A S 
--•~3CCACCGTACTGCGCGCC7GCC7CACCCGGCGCACCGACGGAGCCATG 3 350 
?__2_,I„ V LRACLTRRTDGAM 
j -.- - - C G C C G C C7 7 C G?-,C3GCGC CGGCC7 C C C G G T A C T CACCCC GG AG G C 2 -1 00 

GFAAFDGAGLPVLTAEA 
GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAG7CGGACG 34 50 

VT -R£VASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGG7CTACGACGGT 3500 
G L H RLSWLAVAEAVYDG- 
GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 

2LPEGHVLITAAHPDDP 
CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3 600 

-01 PTRAHTRATRVLT 
CCC7GCAACACCACC7CACCACCACCGACCACACCCTCATCG7CCACA.ee 3 650 
15ALQHHLTTTDHTLIVHT 

ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3 7 00 

T TDPAGATVTGLTRTAQ 
GAJ^CGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 37 50 

^ E H PHRIRLIETDHPH 
:CCCCC7CCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACCTCCGC 3800 
'-'?LPLAQLATLDHPHLR 
C7CACCCACCACACCC7CCACCACCCCCACC7CACCCCCCTCCACACCAC 38 50 

-THHTLHH-PHLTPLHTT 
C ACCCCACCCACC ACCACCCCCCTCAACCCCGAACACGCC ATCATCATC A 3 900 
25 T P P TTT PLNPEHAI I I 

CCGGCGGC7CCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950 
TGGSGTLAG ILARHLNH 
CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4 000 

PHTYLLSRTPPPDATPG 
CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4 050 

TH.LPCDVGDP HQLATT 
TCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCGCCACC 4100 
- T H I PQPLTAI FHTAAT 
C7CGACGACGGCATCC7CCACGCCCTCACCCCCGACCGCC7CACCACCGT 4 150 
35 L D 0 G I LHALTPDRLTTV 

CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4 200 

LH PKANAAWHLHH LTQ 
A.CCAACCCCTCACCCACTTCGTCCTCTACTCCAGCCCCGCCGCCGTCCTC 4 2 50 
N 0 P L T H F V L Y S S A A A V L 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCC77CCTCGACGC 4 300 

GS PGQGNYAAANAFLDA 
CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4 350 

LATHRHTLGQPATS IA 
GGGGCA7GTGGCACACCACCAGCACCC7CACCGGACAACTCGACGACGCC 4 4 00 
45 WGMWHTTSTLTGQLDDA 

GACCGGGACCGCATCCGCCGCGGCGGTTTCC7CCCGATCACGGACGACGA 4 4 50 
. D R D R I RRGGFLPITDDE 
GGGCATGGGGATGCA7 
G 






The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS with 
the endogenous AT domain replaced by the AT domain of module 13 (specific for 
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino 
acid sequence shown below. 



^ AGATC7GGCAGCTCGCCGAAGCGCTGCTGACGCTCG7CCGGGAGAGCACC 50 
OLA EALLTLV REST 
GCCGCCG7GCTCCGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAV 1 G ii V C 3 E G I ? A T A A 
C77CAAGGACCTCGGCA7CGAC7CGC7CACC3CGGTCCAGCTGCGCAACG 1 5 J 

K D L G I D S L7AVQLRN 
"J ? - T " .A C C G A G G C G A C C G G 7 3 7 3 G G G G 7 G A-A G 3 C C AC G G CG G 7 C 7 7 C G A C ZZZ 
5 A L T E A T G V R L M A T A V F D 

TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

F?T?HVLAG KLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 30 0 
7 R. A P V V P R T A. A T A G A H 
1 0 ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 3 50 
DEPLAI VGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

AS PEELWH LVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGC7GGGACGTCGACGCGATCTACGACC 4 50 
15 TEFPTDRGWDVDAIYD 

CGGACCCCGACGCGATCGGCAAGA.CCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTC77CGGCATCAGCCCGCGCGA 5 50 
TGATG FDAAF "GI3PRE 
20 GGCCCTCGCGATGGACCCGCAGCAGCGGGTGC7CCTGGAGACGTCGTGGG .600 
ALAMDPQQRVLLETSW 
AG GCGTTCGAAAGCCCCCGCATCACCCCGGACTCGACCCGCGGCAGCGAC 65 0 
EAFESAG I T PDSTRGSD 
ACCGGCGTGT7CGTCGGCGCCT7GTCCTACGGTTACGGCACCGGTGCGGA 7 00 
25 TGVFVGAFS YGYGTGAD 

CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 

TDGFGATGSQTSVLSG ~ 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEG PAVTVDT 
30 GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 8 50 
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCC7CGCGCCGGAC 950 
35 5 P.GG FV E FS RQRGLAPD 

GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGA DGTS FAE 
GGGTGCCGG7GTGCTGATCGTCGAGAGGC7C7CCGACGCCGAACGCAACG 1050 
G A G V 1 I V Z P. L 3 DAE R N 
40 G7CA GACCGTCC7GGCGGTCG7CCG7GG7TCGGCGGTCAACCAGGATGG7 1 1 C 2 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNG PSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
45 RQALANAGLT PADVDA 

TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 12 50 
VEAHGTGTRLGDPIEAQ 
CCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCC7GCTGCTGGG 13 00 
AVLATYGQE RAT PLLLG 
50 CTCGCTGAAG7CCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSN I GHAQAASGVA 
GCA7CATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1 4 OC 
G I I K M V Q A L . P. K G E L P P T 
CTGCACGCCGACGAGCCCTCGCCGCACGTCGACTGCACGGCCGGCGCCG7 14 50 
55 L, H A D E P S P K V 3 W T A G A V 

CGAA.C7GCTGA.CGTCGGCCCGGCCG7GGCCCGA.GACCGACCGGCCACGGC 153? 
~ L. G 7 8 A P ? v.' ? Z 7 n P ? P. 

R A A V. S S F G V 5 G T N A H V I 
60 G7GGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGG7GA 160 0 
^ E A G ? V T E T P A A S P S -3 Z> 
CCTTCCCC7GC7GG7G7CGGCACGC7CACCGGAAGCGC7CGACGAGCAGA 1650 ■ 

1? ^-I-V5ARSPcALD£Q 
7 C C 3 C C G A.C 7 G C 3 C 3 C CT AC C 7 G G AC ACC AC C C C 3 G AC GT C G AC 1 7 CO 

3 - f<. R L £ A Y L D T T ? D V D R V 

GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 17 50 

AVAQTLARRTH FAHRAV 
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 13 00 
LLGDTVITTPPADRPD 
1 0 AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 13 50 
ELVFVYSGQG7QH PAMG 
GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900 

EQLADSSVVFA ERMAEC 
TGCGGCGGCG7TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950 
15 AAALREFVDWDL FTVL 

A7GA7CCGGCGGTGG7GGACCGGG77GA7G7GG7CCAGCCCGC77CC7GG 2000 
D-D PA V V DRV DVVQ PASW 
GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050 
AM MVS LAAVWQAAGVRP 
20 GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 
DAVIGHSQGEIAAACV 
CGGG7CCGG7G7CAC7ACGCGA7GCCGCCCGGA7CG7GACC77GCGCAGC 215 0 
AGAVSLRDAAR: V 7 L R ' S 
CAGGCGA7CGCCCGGGGCC7GGCGGGCCGGGGCGCGA7GGCA7CCG7CGC 2200 
-5 QAIARG-LAGRGAMASVA 

CC7GCCCGCGCAGGA7G7CGAGCTGG7CGACGGGGCCTGGATCGCCGCCC 22 50 

L P AQ DVE L V DGAW I AA 
ACAACGGGCCCGCC7CCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300 
HNGPAS7VIAG7 P E A V D 
30 CA7G7CC7CACCGC7CA7GAGGCACAAGGGG7GCGGGTGCGGCGGA7CAC 2350 
HVLTAHEAQGVRVRRI7 
CG7CGAC7A7GCC7CGCACACCCCGCACG7CGAGCTGATCCGCGACGAAC 2 4 00 

VDYASHTPHVELIRDE 
7 AC7CGAC A7CAC7AGCGACAGCAGCTCGCAGACCCCGCTCG7GCCG7GG 2 4 50 
35 1LDI 7SDSSSQ7PLVPW 

C7G7CGACCG7GGACGGCACC7GGG7CGACAGCCCGC7GGACGGGGAG7A 2 500 

LSTVDG7WVDSPLDGEY 
C7GG7ACCGGAACCTGCG7GAACCGG7CGGT77CCACCCCGCCG7CAGCC 2 5 50 
WYRNLREPVGtH PAVS 
40 AG77GCAGGCCCAGGGCGACACCG7G77CG7CGAGG7CAGCGCCAGCCCG 2 600 
Q 1QAQG D 7 V F,VE V S AS P 
GTG77G77GCAGGCGATGGACGACGATG7CG7CACGGT7GCCACGC7GCG 2 650 

L L QAM D D DVV7 V A 7 L R 
7CG7GACGACGGCGACGCCACCCGGA7GC7CACCGCCCTGGCACAGGCC7 2700 
45 RDDG DATRML7ALAQA 

A7G7CCACGGCG7CACCG7CGAC7GGCCCGCCA7CC7CGGCACCACCACA 2 7 50 
VVHGV7VDWPAI LG777 
ACCCGGG7AC7GGACCT7CCGACCTACGCC77CCAACACCAGCGG7AC7G 2 300 
7RVLDLPTYAFQHQRYW 
50 GC7CGAG7CGGCACGCCCGGCCGCA7CCGACGCGGGCCACCCCG7GC7GG 2 3 50 
LES-ARPAASDAGH PVL 
CC7CCGC7A7CGCCC7CGCCGGGTCGCCGGGCCCGG7GTTCACGGGT7CC 2 900 
GSG I A LAGS PGRVFTGS 
G7GCCGACCGGTGCGGACCGCGCGG7G77CGTCGCCGAGC7GGCGC7GGC 2 950 
55 P 7 G A D R A V F V A E L A L A 

CGCCGCGGACGCGG7CGAC7GCGCCACGG7CCAGCGGC7CGACA7CCCC7 3000 
A ADA V r C A 7 V E •» I. C ' A 
i s- ■w ^ j -j w w vj j C J CG G CC A C G C C GG AC 3 — . Z C J Z — .C AG AC C7 3 GG _ 3 ; 
2 V P G R P G H G R T T V Q T W V 
60 3ACGAGCCGGCGGACGACGGCCGGCGCCGG77CACCG7GCACACCCGCAC 3100 
0 Z r 3 0 G R R R F T V K T R 7 

CGGCGACG G G G C GTCGACGCT 3CACGCCGAGGGGGT GCTCC GCCCCCATG 3130 
G C A ? W T L H A E G 7 L = ? H 
-- * — ; o _ . _ -. , — . o *_ o ^ o A-^ G C G G AG T G G G G C G C A C G G 3G C GC G 3 0 r 
$ G T A L = D A A D A E W ? P ? G A 

GTCCCCGGGGAG CGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 32 50 

v PACGLPGVWRRGDOVF 
CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3 300 
AEAEVDGPDGFVVHPD 
10 TGCTCGACGCGG7CTTCTCCGCGGTCGGCGACGGAAGCCGCCAGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 
GGATGGCGCGACC7GACGGTGCACGCGTCGGACGCCACCGTAC7GCGCGC 34 00 

G W R DLTVHAS.DATVLRA 
CTCCCTCA.CCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 34 50 
15 CLTRRTDGAMGFAAFD 

CCGCCGGCCTGCCGGTACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
GAG L PVLTAEAVTLREV 
GCGTCACC37CCGSCTCCGAGGAGTCGGACGGCCTGCACCGCT7GGAGTG 35 50 
AS PSGSEESDGLHRLEW 
20 GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTSCCCGAGGGACATG 3 600 
L A V A E A V Y D G D L P Z G H 
* C7CTGA.TC.-\CC G CCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3 6 S 0 
V L Z 7 A A H P D D P E D I ? T R 
GCCCACACCCGGCCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 37 00 
25 AHTRATRVLTALQHHLT 

CACCACCG ACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 37 50 

TTDHTLIVHTTTDPAG 
CCACCG7CACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEH PHR 
30 ATCCGCC7CATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 38 50 
I RL I ETDHPH 'TPLPLAQ 
ACTCGCCACCC7CGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900 

LATLDHPHLRLTHHTL 
ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3 950 
■35 HHrHLTPLHTTTPPTTT 

CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4 000 

PLN 1 PEHAI IITGGSGTL 
CGCCGGCA7CC7CGCCCGCCACCTGAACCACCCCCACACCTACCTCC7CT 4 0 50 
AG I 1ARHLNHPHTVLL 
40 CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4 100 
SET ? r PD.ATPGTHLrCD 
GTCGGGGAGCCGCACCAACTCGCCACGACCCTCACCGACATCGGGCAA.ee 4 150 

VG Z PKQLATTLTH I PQP 
CCTCACCGCCA7CTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4 2 00 
45 LTAI FHTAATLDDGIL 

ACGCCC7CACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4 2 50 
HALT PDRLTTVLH PKA N 
GCCCCC7GGCACC7GCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4 300 
AAWHLHHLTQNQPLTHF 
50 CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4 350 
VLYSSAAAVLGS PGQG 
ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 4 4 00 
N V A A A N A F L D A L A T H R H 
ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 4 4 50 
55 7LG0PATSIAWGMWKTT 

CAGCACGCTCACCCGACAACTCGACGACGCCGACCGGGACCC GATCCGCC 4 500 
5. 7 1 7 G Q T- D D A D ?. D P : R 

— — — — ~nm,-*~~~~.^.^s-~-~*~. ^^^^-.^ * *--"<■-'— - — — — 

RGGFLPITDDEG 

Phage KC515 DNA was prepared using the procedure described in Genetic 
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A phage 
suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on S. lividans 
TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to circularize it at the 
cos site, then digested with restriction enzymes BamHI and PstI and 
dephosphorylated with SAP. 

Each module 8 cassette described above was excised with restriction enzymes BglII 
and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage DNA 
prepared as described above. The ligation mixture containing KC515 and various cassettes 
was transfected into protoplasts of Streptomyces lividans TK24 using the procedure 
described in Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D. 
Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques were restreaked 
on plates overlaid with TK24 spores. Single plaques were picked and resuspended in 200 
µL of nutrient broth. Phage DNA was prepared by the boiling method (Hopwood et al., 
supra). PCR with primers spanning the left and right boundaries of the recombinant 
phage was used to verify that the correct phage had been isolated. In most cases, at least 80% of 
the plaques contained the expected insert. To confirm the presence of the resistance marker 
(thiostrepton), a spot test is used, as described in Lomovskaya et al. (1997), in which a plate 
with spots of phage is overlaid with a mixture of spores of TK24 and a phiC31 TK24 lysogen. 
After overnight incubation, the plate is overlaid with antibiotic in soft agar. A working stock 
is made of all phage containing desired constructs. 
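
The cloning step above relies on BglII ends being compatible with BamHI ends, and NsiI ends with PstI ends. The following Python sketch illustrates why, using the well-known recognition sites and cut positions of the four enzymes (A^GATCT, G^GATCC, ATGCA^T, CTGCA^G). The `Enzyme` class and `compatible` function are hypothetical helpers written for this illustration, not part of any procedure described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str  # palindromic recognition site, top strand, 5'->3'
    cut: int   # number of site bases to the left of the top-strand cut

    def overhang(self):
        """Return (kind, bases) for the single-stranded end left by the cut."""
        # For a palindromic site the bottom-strand cut mirrors the top-strand cut.
        mirror = len(self.site) - self.cut
        if self.cut == mirror:
            return ("blunt", "")
        kind = "5'" if self.cut < mirror else "3'"
        lo, hi = sorted((self.cut, mirror))
        return (kind, self.site[lo:hi])


BGLII = Enzyme("BglII", "AGATCT", 1)  # A^GATCT
BAMHI = Enzyme("BamHI", "GGATCC", 1)  # G^GATCC
NSII = Enzyme("NsiI", "ATGCAT", 5)    # ATGCA^T
PSTI = Enzyme("PstI", "CTGCAG", 5)    # CTGCA^G


def compatible(a, b):
    """True if two enzymes leave identical cohesive ends.

    Equality is sufficient here only because these overhangs are themselves
    palindromic; it is not a general test for arbitrary enzymes.
    """
    return a.overhang()[0] != "blunt" and a.overhang() == b.overhang()


if __name__ == "__main__":
    for a, b in [(BGLII, BAMHI), (NSII, PSTI), (BGLII, PSTI)]:
        print(a.name, b.name, compatible(a, b))
```

BglII and BamHI both leave a 5' GATC overhang, and NsiI and PstI both leave a 3' TGCA overhang, so each cassette can be inserted in a single orientation. As read here, the cassette listings begin with AGATCT and end with ATGCAT, which is consistent with BglII and NsiI sites at the two ends.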

Streptomyces hygroscopicus ATCC 14891 (see U.S. Patent No. 3,244,592, issued 5 
Apr 1966, incorporated herein by reference) mycelia were infected with the recombinant 
phage by mixing the spores and phage (1 x 10^8 of each), and incubating on R2YE agar 
(Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et 
al.) at 30°C for 10 days. Recombinant clones were selected and plated on minimal medium 
containing thiostrepton (50 µg/ml) to select for the thiostrepton resistance-conferring gene. 
Primary thiostrepton-resistant clones were isolated and purified through a second round of 
single colony isolation, as necessary. To obtain thiostrepton-sensitive revertants that 
had undergone a second recombination event to evict the phage genome, primary recombinants 
were propagated in liquid media for two to three days in the absence of thiostrepton and 
then spread on agar medium without thiostrepton to obtain spores. Spores were plated to 
obtain about 50 colonies per plate, and thiostrepton-sensitive colonies were identified by 
replica plating onto thiostrepton-containing agar medium. PCR was used to determine 
which of the thiostrepton-sensitive colonies had reverted to the wild type (reversal of the initial 
integration event) and which contained the desired AT swap at module 8 in the ATCC 14891-derived 
cells. The PCR primers used amplified either the KS/AT junction or the AT/DH 
junction of the wild-type and the desired recombinant strains. Fermentation of the 
recombinant strains, followed by isolation of the metabolites and analysis by LC-MS and 
NMR, is used to characterize the novel polyketide compounds. 
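
The PCR screen described above can be illustrated with a small in-silico sketch in Python. It assumes one forward primer in the flanking KS region and two alternative reverse primers, one specific for the wild-type AT domain and one for the replacement AT domain. That primer design is an assumption made for illustration; the text states only that the primers amplify the KS/AT or AT/DH junction. All sequences below are short invented placeholders, not sequences from the listings, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template, fwd, rev):
    """Return the product of fwd/rev on the top strand of template, or None."""
    start = template.find(fwd)
    if start < 0:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    site = revcomp(rev)
    end = template.find(site, start + len(fwd))
    if end < 0:
        return None
    return template[start:end + len(site)]


def genotype(template, fwd, rev_wt, rev_swap):
    """Classify a colony by which junction-specific primer pair gives a product."""
    wt = amplicon(template, fwd, rev_wt) is not None
    swap = amplicon(template, fwd, rev_swap) is not None
    if wt and not swap:
        return "wild-type revertant"
    if swap and not wt:
        return "AT swap"
    return "ambiguous"


# Invented toy sequences (placeholders only).
KS_FLANK = "ACGTTGCAAGGC"
WT_AT = "TTTGGGCCCAAATTT"
SWAP_AT = "GGATATCCGCGATAC"
FWD = "ACGTTGCA"
REV_WT = revcomp("GGGCCCAAA")
REV_SWAP = revcomp("ATCCGCGAT")

if __name__ == "__main__":
    for at_domain in (WT_AT, SWAP_AT):
        colony = KS_FLANK + "CC" + at_domain + "AA"
        print(genotype(colony, FWD, REV_WT, REV_SWAP))
```

In practice the two outcomes would be read from the presence and size of bands on a gel; the sketch only captures the logic that a junction-spanning product identifies which AT domain sits next to the KS domain.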

Example 2 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506 

The present invention also provides the 13-desmethoxy derivatives of FK-506 and 
the novel PKS enzymes that produce them. A variety of Streptomyces strains that produce 
FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927), 
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S. hygroscopicus 
subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366, incorporated 
herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S. Patent No. 
5,116,756, incorporated herein by reference; and S. sp. MA6548, described in Motamedi et 
al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the 
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al., 1997, 
"Structural organization of a multifunctional polyketide synthase involved in the 
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80, 
each of which is incorporated herein by reference. 

The complete sequence of the FK-506 gene cluster from Streptomyces sp. MA6548
is known, and the sequences of the corresponding gene clusters from other
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant gene
clusters of the present invention differ from the naturally occurring gene clusters in that the
AT domain of module 8 of the naturally occurring PKSs is replaced by an AT domain
specific for malonyl CoA or methylmalonyl CoA. These AT domain replacements are made
at the DNA level, following the methodology described in Example 1.
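
A domain replacement at the DNA level can be sketched as a cassette swap between two restriction sites, followed by a translation check that the reading frame is preserved. The sketch below is a simplified illustration, not the construction protocol itself. The function names, the toy host and donor sequences, and the choice of AvrII (CCTAGG) and XhoI (CTCGAG) as flanking sites are assumptions made for the example. It also assumes that each site occurs exactly once and that the cassette lies within a single open reading frame.

```python
# Illustrative sketch only: toy sequences, hypothetical helper names.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

AVRII, XHOI = "CCTAGG", "CTCGAG"  # recognition sequences

def translate(dna, frame=0):
    """Translate an upper-case DNA string from the given offset; '*' marks a stop."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3))

def _cassette_bounds(seq, left, right):
    """Return (start, end) spanning left site through right site, inclusive."""
    if seq.count(left) != 1 or seq.count(right) != 1:
        raise ValueError("each restriction site must occur exactly once")
    start, r = seq.index(left), seq.index(right)
    if r < start + len(left):
        raise ValueError("right site must lie downstream of left site")
    return start, r + len(right)

def swap_cassette(host, donor, left=AVRII, right=XHOI):
    """Replace the host segment between the two sites with the donor segment."""
    hs, he = _cassette_bounds(host, left, right)
    ds, de = _cassette_bounds(donor, left, right)
    if ((de - ds) - (he - hs)) % 3 != 0:
        raise ValueError("swap would shift the downstream reading frame")
    return host[:hs] + donor[ds:de] + host[he:]

# Toy example: the six-nucleotide insert GCAGCA is replaced by GTTGTTGTT.
host = "ATGCGG" + AVRII + "GCAGCA" + XHOI + "TCGTAA"
donor = "GG" + AVRII + "GTTGTTGTT" + XHOI + "A"
hybrid = swap_cassette(host, donor)
print(translate(host))    # MRPRAALES*
print(translate(hybrid))  # MRPRVVVLES*
```

The same translate helper can be used to spot-check a displayed sequence against the amino acid line printed beneath it, provided the correct frame offset is supplied.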

The naturally occurring module 8 sequence for the MA6548 strain is shown below,
followed by the illustrative hybrid module 8 sequences for the MA6548 strains.

- - — — Ows_ . w T A C G AG G •. . _.-.C — oC3C/-.CCGGAAvjTCC*-'jTG*jTG^j : G ~ ~ 
M R L Y E A A. R R T G S P V V V 
GCGGCCGCGCTCGACCACGGGCCGGACGTGCCGCTGCTGCGCGGGCTGGG 100 

35 aaalddapdvpllrglr 
rr-CGTACGACCGTCrGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 1 50

R ? 7 V R R A A V R E R S LAO 
^CTCGCCGTGC7GCCCGACGACGAGCGCGCCGA"CC7CCC7CGCGTTCG 200 

• s f : ? t t s a ? r ? ? s r s 

- TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 2 30 

3 W N S TATVLGii LGAED I 
rCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGID5LTA 
TCCAGC7GC3CAACGCGC7GACCACGGCGACCGGCGTACGCCTCAACGCC 3 50 
1 0 VQLRNALTTATGVRLNA 

ACAGCGGTC77CGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGG7ACCCGCGCGCCCG7CGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
1 5 CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 

TAAAH DEPLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

L PGGVA S PQELWRLV.AS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
20 GT DAITEFPADRGWDV 

_ ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACC7TCGTCCGG 650 

- A L ':' 0 P D P D A I G K T F V R 
7ACGGCGGC7777TCGACGGTGCGACCGGCTT CGACGCGGCGTTCTTCGG 7 00 
H G G- F L DGA TG F D A A FFG 
25 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 

ISPREALAMDPQQRVL 
. 7GGAGACGTCC7GGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGI TPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 8 50 
30 ARGS DTGVFIGAFS Y G Y 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
3 5 GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQS LRSGECSLALVGG 
7CACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
40 . 7 V MAS PGG FVE FS RQR 

3GGCTCGCGCCGGACGGGCGGGCGAAGGCG7TCGGCGCGGGCGCGGACGG 1150 

GLAPCGRAKAFGAGADG 
7ACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
T S F A E G A G A L V V E R L S 
45 ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 12 50 

2AERHGH7VLALVRGSA 
GCTAAC7CCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

AN S DGA SNGLSA PNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
50 QERVIHQALANAKL7P 

eCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVOAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC .14 50 
PI EAQALLATYGQ DRAT 
55 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 

PLLLGSLKSNIGHAQA 
C G7CAGGGG7CGCCGGGA7CA7CAAGA7GG7GCAGGCCA7CCGGCACGGG 15 5 0 
.-. S G V A G I I K M V Q A I R H G 
GAACTGCCGCCGACACTCCACGCGGACGAGCCGTCGCCGCACGTCGAC7G 160 0 
60 ELPP-7LHADEP3PHVDW 
GACGCCCGG7GCCC7CGAGC7CC7GACG7CGGCCCGGCCG7GGCCGGGGA 1650

? A G A V E L L T S A R ? W ? G 
CCGG7CGCCCGCGCCGCGC7GCCG7C7CG7CG77CG3CG7GA.GCGGCACG 1700 

5 AA C G C 3 C AC A 7 C A 7 C C 7 7 G AG G C AGG AC C G G 7 C AAAAG G G G A C CG G T C G A 17 50 

N A H I I L E A G P V K ? 3 ? V E 

GGCAGGAGCGA7CGAGGCAGGACCGG7CGAAG7AGGACCGG7CGAGGC7G 18 0C 

A 3 A I E A G ? V E V 3 ? V E A 

GACCG37CCCCGCGGCGCCGCCG7CAGCACCGG3C3AAGAC377CCGC7G 18 50 

* 0 3 f lfaappsap3ed1pl 

c7cg7g7cggcgcg77ccccggaggcactcgacga.gcaga7cgggcgcc7 1 900 

lv5arspealdeqigrl 
gcgcgcc7a7c7cgacaccggcccgggcg7cgaccgggcggccg7ggcgc 1 950 
rayld7gpgvdraava 
1 5 agacac7ggcccggcg7acgcac77cacccaccgggccg7ac7gc7cggg 2000 
q7larr7hf7hravllg 
gacaccg7ca7cggcgc7ccccccgcggaccaggccgacgaac7cg7c77 2 050 

d7vi gappa dqade lvf 
cg7c7ac7ccgg7cagggcacccagca7cccgcga7gggcgagcaac7cg 2100 
20 v. ysgqgtqhpamgeql 

cggccgcg77ccccg7g7tcgccga7gcc7ggca.cgacgcgctccgacgg 2150 
a a a f f y f a d a w h 3 a l r r 
c7cgacgaccccgacccgcacgaccccacacgga.gccagcacacgctct7 2200 
lddpdphdptrsqhtlf 
25 cgcccaccaggcggcgttcaccgccctcctgaggtcctgggacatcacgc 22 50 
ahqaaftallrswdit 
cgcacgccg7catcggccac7cgc7cggcgaga7caccgccgcg7acgcc 2 300 
phavighslge: 7 a a y a 
gccggga7cc7g7cgc7cgacgacgcc7gcaccc7ga.7caccacgcg7gc 2350 
30 ag i lslddac7li77ra 

ccgcc7ca7gcacacgc77ccgccgcccggcgcca7gg7caccg7gc7ga 24 00 

rlmh7lpppgamv7vl 
ccagcgaggaggaggcccg7caggcgc7gcggccgggcg7ggaga7cgcc 24 50 
7seeearqalrpgveia 
35 gcgg7c77cggcccgcac7ccg7cg7gc7c7cgggcgacgaggacgccgt 2500 
av fg phsvvls-gde dav 
gc7cgacg7cgcacagcggc7cggca7ccaccaccg7c7gcccgcgccgc 2 5 50 

l dvaqrlgi hh r l ? a p 
acgcgggccactccgcgcacatggaacccgtggccgccgagctgctcgcc 2600 
40 hagh sahme pvaae l la 

accactcgcgagc7ccgt7acgaccggccccacagcgccatcccgaacga 2 650 

77relrydrph 7ai pnd 
ccccaccaccgccgag7ac7gggccgagcagg7ccgcaaccccg7gc7g7 27 00 
p77aeywaeqv ?. n ? v l 
45 7ccacgcccacacccagcgg7accccgacgccg7377cg7c3aga7cggc 27 50 
f h a h 7 q r y p d a v f v e i g 
cccggccaggacctc7caccgctggtcgacggca7cgccc7gcagaacgg 2800 

pgqdlsplvdg ialqng 
cacggcggacgagg7gcacgcgc7gcacaccgcgc7cgcccgcct0ttca 28 50 
50 7adevhalh7alarlf 

cacgcggcgccacgc7cgac7gg7cccgca7cc7cggcgg7gc77cgcgg 2 900 
7rga7ldwsri lggasr 
cacga.ccc7gacgtcccc7cg7acgcgt7ccagcggcgtccc7ac7ggat 2 950 
h dpdvpsyafq?. rpywi 
55 cgag7cggc7cccccggccacggccgactcgggcca.ccccg7cc7cggca 3000 
esappa7adsghpvlg 
ccgga.g7cgccg7cgccggg7cgccgggccggg737tcacggg7cccg7g 30 50 

7 G V A V A Ci $ P G R F T 3 ? V 

CCCGCCGGTGCGGACCGCCCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100 
60 ? A G A DRAVFIAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 315 0

ADATDCATVEQLDVTS 
TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 32 00 
P G G S A ?. G R A T A 2- T W V D 
5 3AACCCGCCGCCGACGGGCGGCGCCGCT7CACCGTCCACACCCGCGTCGG 32 5 0 

SPAADG?. RRFTVHTP. VG 
•-GACGCCCCG7GGACGCTGCACGCCGAGGGGG7TC7CCGCCCCGGCCGCG 3300 

DAPWTLHAEGVLRPGR 
7GCCCCAGCCCGAAGCCG7CGACACCGCC7GGCCCCCGCCGGGCGCGGTG 3 3 5 0 
10 VPQPEAVDTAWPPPGAV 

CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 34 00 

?ADGLPGAWRRADQVFV 
wGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 34 50 
E A E V D S PDGFVAHPDL 
1 5 TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 3500 

LDAVFSAVGDGSRQPTG 
TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550 

WRDLAVHASDATVLRAC 
CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600 
20 LTRRDSGVVELA AFDG 

CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650 
AGMPVLTAESVTLGEVA 
TCGGCAGGCGGATCCG ACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 37 00 
SAGGSDES DGLLRLEWL 
25 3CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 3750 * 

PVAEAH Y DGADELPEG 
AC ACCCTC A7CACCGCC AC ACACCCCGACGACCCCG ACGACCCC ACCAAC 38 00 
VTLI TATH PDDPDDPTN 
CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 38 50 
30 PHNTPTRTHTQTTRVLT 

CGCCC7CCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3 900 

ALQHHL I 77NHTLIVH 
CCACCACCGACCCCCCAGGCGCCGCCG7CACCGGCC7CACCCGCACCGCA 3950 
777DPPGAAV7GL7R7A 
35 CAAAACGAACACCCCGGCCGCA7CCACC7CA7CGAAACCCACCACCCCCA 4 000 

QNEHPGRIHLIE7HHPH 
CACCCCAC7CCCCC7CACCCAAC7CACCACCC7CCACCAACCCCACC7AC 4 050 

7PLPL7QL77LHQPHL 
GCC7CACCAACAACACCC7CCACACCCCCCACC7CACCCCCA7CACCACC 4 100 
40 RL7NN7LH7PH. L7PI7T 

CACCACAACACCACCACAACCACCCCCAACACCCCACCCC7CAACCCCAA 4150 

HHN77777 PN7PPLNPN 
CCACGCCA7CC7CA7CACCGGCGGC7CCGGCACCC7CGCCGGCA7CC7CG 4 200 
HAILI7GGSG7LAGIL 
45 CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4 2 50 

ARHLNHPHTYLLSRTPP. 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACC7CACCGACCCCAC 4 300 

PP77PG7H I PCDL7DP7 
CCAAA7CACCCAAGCCC7CACCCACA7ACCACAACCCC7CACCGGCA7C7 4 35 0 
50 QI7QAL7H I PQPL7GI 

TCCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 4 00 
FHTAATLDDATLTNLTP 
CAACACC7CACCACCACCC7CCAACCCAAAGCCGACGCCGCC7GGCACC7 4 4 50 
QHL777LQPKADAAWHL 
55 CCACCACCACACCCAAAACCAACCCC7CACCCAC77CG7CC7C7AC7CCA 4 500 

HHH7QNQPL7HFVLYS 
GCGCCGCCGCC ACCC7CGGC AGCCCCGGCCAAGCCAACT ACGCCGCCGCC 4 5 50 
3AAATLGS PGQANYAAA 
AACGCC77CC7CGACGCCC7CGCCACCCACCGCCACACCCAAGGACAACC 4 600 
60 NAFLDALATHRHTQGQ'P- 
CGCCACCACCATCGCCTGGGGCA7G7GGCACACCACCACCACAC7CACCA 4 65 0

ATTIAWGMWHTTTTLT 
GCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 4 7 00 
s v - T G 3 C ?. 0 R I 5. ft G C ~ L 
5 CCGA7C7CGGACGACGAGGGCA7GC 
PI S D D E G M 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of module
12 of rapamycin is shown below.

1 0 GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTCGTGGTG 5 0 
MRLYEAARR7GSPVVV 
GCGGCCGCCC7CGACGACGCGCCGGACG7GCCGC7GC7GCGCGGGC7GCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTC7CTCGCCGACC 150 
15 RTTVRRAAVRERSLAD 

GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSR-S 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 2 50 
SWNSTATVLGHLGAEDI 
20 CCCGGCG ACGACGACG77CAAGGAACTCGGCATCGACTCGCTCACCGGGG 300 
PA777 FKELGI DSL7A 
TCCAGC7GCGCAACGCGC7GACCACGGCGACCGGCG7ACGCC7CAACGCC 3 5 0 
VQLR"NAL7TATGVRLNA 
ACAGCGG7C77CGAC777CCGACGCCGCGCGCGC7CGCCGCGAGAC7CGG 4 00 
25 7AV FDFPT PRALAARLG 

CG ACGAGCTGGCCGG7ACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 

DELAG7RAPVAAR7AA 
CCGCGGCCGCGCACGACGAACCGG7GGCGA7CG7GGGCA7GGCC7GCCG7 500 
7AAAHDEPLAIVGMACR 
30 C7GCCGGGCGGGGTCGCG7CGCCACAGGAGCTG7GGCGTCTCGTCGCG7C 5 50 
LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAG7TCCCCGCGGACCGCGGCTGGGACG7GG 600 

G7DA I 7E FP ADRGWDV 
ACGCGC7C7ACGACCCGGACCCCGACGCGA7CGGCAAGACC77CG7CCGG 650 
35 DALYDPDPDAIGK7FVR 

CACGGCGGC77CC7CGACGG7GGGACCGGC77CGACGCGGCG77C77CGG 7 00 

HGGFLDGA7GFDAAFFG 
G A7CAGCCCGCGCGAGGCCC7GGCCA7GGACCCGCAGCAACGGG7GC7CC 7 50 
I S PRE ALAM DPQQRVL 
40 TGGAGACG7CC7GGGAGGCG77CGAAAGCGCGGGCA7CACCCCGGACGCG 8 00 
LE7SWEAFESAG I7PDA 
GCGCGGGGCAGCGACACCGGCG7G7TCATCGGCGCGTTCTCCTACGGGTA 8 50 

ARGSD7GVFIGAFSYGY 
CGGCACGGG7GCGGA7ACCAACGGC77CGGCGCGACAGGG7CGCAGACCA 900 
45 G7GAD7NG FGA7GSQ7 

CCG7GCTCTCCGGCCGCC7C7CG7AC7TCTACGG7CTGGAGGGCCC77CC 9 50 
S V i L SG RLS Y FYG LEG PS 
G7CACGG7CGACACCGCC7GC7CG7CG7CAC7GG7CGCCC7GCACCAGGC 10 00 
V7VDTACS3SLVALHQA 
50 AGGGCAG7CCC7GCGC7CCGGCGAA7GC7CGC7CGCCC7GG7CGGCGG7C 10 50 
GQSLRSGECSLALVGG 
TCACGG7GA7GGCG7CGCCCGGCGGA77CG7CGAG77C7CCCGGCAGCGC 1100 
V7VMASPGGFVEFSRQR 
GGGC7CGCGCCGGACGGGCGGGCGAAGGCGT7CGGCGCGGGCGCGGACGG 1150 
55 GLA PDGRAKAFGAGADG 

. .'-.Cvj/-.u„ .. * CGCCGAGGGGGG CG G * G CCCT GGT GGT CG AGCGGC TCTCC C 1 ' 

7S FAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCG7CC7CGCCC7CG7ACGCGGC7CCGCG 12 5 0 
DAE R H G H T V LALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1 ^00

A N S CGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
- E R v - ! J Q ALA M A KIT? 

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADV 0AVEAHGTG7RLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 

T EAQALLATYGQDP. AT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 15 00 
? L L 1 G S L K'S N Z G H A Q A 

C37CAGGGG7CGCCGGGA7CATCAAGATGGTGCAGGCCATCCSGCACGGG 1550 
A S G V A G I I K M V Q A I R H G 
GAAC7GCCGCCGACACTGCACGCGGACGAGCCG7CGCCGCACG7CGACTG 1600 
^LPPTLHADEPS ? H V D W 

15 -ACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 1700 
TGRPRRAGVSSFGISGT 
AAC GCCC AC GTCATCCTGGAAAGCGCACCCCCC ACT CAGCCTGCGGACAA 17 50 

-° N A H V I L E SA P P T Q PA DN 

CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 1800 
A V I ERAPEWV? L V I SA 

GGACCCAGTCGGCTT7GACTGAGCACGAGGGCCGG7TGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
25 GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1900 
AASPGVDMRAVASTLAM 
GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
7CACCGGCACCGCTGTGTCTGACCCTCGGGCGGTGT7CGTCTTCCCGGGA 2000 
30 VTGTAVSDPRAVFVFPG 

CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050 

QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 
VFARI HQQVWDL.LDVP 
35 ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150 
DLEVNETGYAQPALFAM 
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVALFGLLE SWGVRPDA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
40 VI GHSVGELAAAYVSG 

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2300 
VWSLEDACTLVSARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2 350 
MQAL PAGGVMVAVPVSE 
45 GGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCA 2 4 00 
DEARAVLGEGVE IAAV 
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAG 24 50 
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2 500 
50 AAEGLGKWTRLATSHAF 

CCATTCCGCCCGTA7GGAACCCATGCTGGAGGAGT7CCGGGCGG7CGCCG 2 5 50 

HSARMEPMLE.E F P. A V A 
AAGGCCTGACCTACCGGACCCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2 600 
EGLTYRTPQVSMAVGDQ 
55 ^'"'GACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2 650 
v T T A E Y W V R Q V R D T V R F 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGC7GGGTG 2 7 00 

3 E Q V A 5 V E D A V F V E L G 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 2 7 50 
60 ADR.SLARLVDGVAMLHG 
GACCAC 3AAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800

- H =- I Q A A I GALAHLYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 28 50 
^ '•* 7 " D W P A L L G D A ? A T 
5 GGG7GC73GACC77CCGACA7ACGCC7TCCAGCACCAGCGCTAC7GGCTC 2 900 
R V 1 DLP7YAFQHQRYWL 
GAG7CGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCC7CGGCAC 2 950 

E S A ? PATADSGHPVLG7 
CGGAG7CGCCG7CGCCGGG7CGCCGGGCCGGG7GT7CACGGG7CCCG7GC 3000 
*0 3 7 A V A G 5 P G R V F T G P V 

CCGCCGG7GCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PAGATRAVFIAELALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGC7CGACGTCACCTCCGT 3100 
A D A 7 D C A 7 V E Q L D V 7 S V 
15 GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACC7GGG7CGATG 3150 
PGGSARGRATAQTWVD 
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
E PAADGRRRFTVHTRVG 
GACGCCGCG7GGACGG7GCACGCCGAGGGGG77C7CCGCCCCGGCCGCG7 3250 
20 DAPW7LHAEGVLR PGRV 

GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300 

FQPEAVDTAWPPPGAV 
CCGCGGACGGGC7GCCCGGGGCG7GGCGACGCGCGGACCAGG7C77CG7C 3350 
PADG L PGAWRRADQVFV 
25 GAAGCCGAAG7CGACAGCCC7GACGGC77CG7GGCACACCCCGACC7GC7 34 00 . 
EAEVDSPDGFVAHPDLL 
CGAGGCGG7C77C7CCGCGG7CGGCGACGGGAGCCGCCAGCCGACCGGA7 34 50 

DAVFSAVGDGSRQPTG 
GGCGCGACC7CGCGG7GCACGCGTCGGACGCCACCG7GC7GCGCGCCTGC 3500 
30 WR2 LAVHASDATVLRAC 

C7CACCCGCCGCGACAGTGG7GTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550 

LTRRDSGVVELAAFDGA 
CGGAA7GCCGG7GC7CACCGCGGAG7CGG7GACGC7GGGCGAGG7CGCG7 3600 
GMPVL7AESV7LGEVA 
35 CGGCAGGCGGA7CCGACGAG7CGGACGG7C7GC77CGGC77GAG7GG77G 3650 
S AG G S DESDGLL RLE WL 
CCGG7GGCGGAGGCCCAC7ACGACGG7GCCGACGAGC7GCCCGAGGGC7A 3700 

PVAEAHYDGADELPEGY 
CACCC7CA7CACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 37 50 
40 7LI 7A7HPDDPDDP7N 

CCC AC AAC AC ACCC ACACGCACCCAC ACACAAACCACACGCG7CC7CACC 3800 
PHN7 P7R7H7Q77RVL7 
GCCC7CCAACACCACC7CA7CACCACCAACCACACCC7CA7CG7CCACAC 38 50 
ALQHHLI77NH7LIVHT 
45 CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3 900 
77DP PGAAV7GL7R7A 
A.aAACGAACACCCCGGCCGCA7CCACC7CA7CGA.a L ACCCACCACCCCCAC 3 950 
QNEHPGRIHLIETHHPH 
ACCCCAC7CCCCC7CACCCAAC7CACCACCC7CCACCAACCCCACC7ACG 4 000 
50 TFLFLTQ.LTTLHQPHLR 

CCTCACCAACAACACCC7CCACACCCCCCACCTCACCCCCATCACCACCC 4 0 50 

-7NN7LH7PHL7PI77 
ACCACA.ACACCACCACAACCACCCCCAACACCCCACCCC7CAACCCCA.^C 4 100 
H H K 7 7 7 T 7 P N 7 P P L N P N 
55 CACGCCA.7CC7CA7CACCGGCGGCTCCGGCACCC7CGCCGGCA7CCTCGC 4 150 
H A I LITGGSGTLAGILA 
CCG7CACC7CA.^CCACCCCCACACC7ACC7CC7C7CCCGCACACCACCAC 4 200 

• * H ? i*i 7 Y L L 3 R 7 r P 

CCCCCACCACACCCGGCACCCACATCCCCTGCGACC7CACCGACCCCACC 4 250 
60 PP77PGTH I PC DL7DP7. 
-AAATCACCCAAGGCCTCACCCACATACCACAACGCCTCACCGGGATCTT 4 300

SITQALTHIPQPLTGIF 
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4 3 50 

H T A A T L 0 D A T L T N L T P 

5 AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4 4 00 
QHLTTTLQ PKADAAWHL 
CACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAG 4 4 50 

HHHTQNQPLTHFVLYSS 
CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4 500 
10 AAATLGSPGQANYAAA 

ACGCC7TCC7CGACGCCC7CGCCACCCACCGCCACACCCAAGGACAACCC 4 550 
NAFLDALATHRKTQGQP 
GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4 600 
ATT IAWGMWH7TTTLTS 
1 5 CCAAC7CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGC77CC7GC 4 650 
QL7D3DRDR IRRGGFL 
CGATCTCGGACGACGAGGGCATGC 
PTSDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of module
13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCG7GGTGGTG 50 

M RLYEAARR7GSPVVV 
GCGGCCGCGC7CGACGACGCGCCGGACG7GCCGC7GCTGCGCGGGC7GCG 100 
25 AAALDDAPDVPLLRGLR 

GCG7ACGACCG7CCGGCG7GCCGCCG7CCGGGAACGC7CTC7CGCCGACC 150 

RTTVRRAAVRERSLAD 
GC7CGCCG7GC7GCCCGACGACGAGCGCGCCGACGCC7CCC7CGCG77CG 200 
RSPCCP77SAPTPPSRS 
30 T CC7GG AACAGCACCGCCACCG7GC7CGGCCACC7GGGCGCCGAAGACA7 2 50 
SWNSTA7VLGHLGAEDI 
CCCGGCG ACGACGACGT7CAAGGAACTCGGCA7CGAC7CGCTCACCGCGG 3 00 

PATTTFKELGI DSLTA 
7CCAGC7GCGCAACGCGC7GACCACGGCGACCGGCG7ACGCC7CAACGCC 3 50 
-35 VQLRNALT7A7 GVRLNA 

ACAGCGG7CTTCGAC77TCCGACGCCGCGCGCGC7CGCCGCGAGAC7CGG 4 00 

7 A V FDFP7 PRALAARLG 
CGACGAGCTGGCCGG7ACCCGCGCGCCCG7CGCGGCCCGGACCGCGGCCA 4 50 
DELAG7RAPVAARTAA 
40 CCGCGGCCGCGCACGACGAACCGC7GGCGA7CG7GGGCA7GGCC7GCCG7 500 
7AAAHDE PLA IVGMACR 
C7GCCGGGCGGGG7CGCG7CGCCACAGGAGC7G7GGCG7C7CG7CGCG7C 5 50 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCA7CACGGAG77CCCCGCGGACCGCGGC7GGGACG7GG 600 
45 G7 DA I 7 E F PADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCG7CCGG 6 50 
DALYDPDPDA IGK7FVR 
CACGGCGCCTTCC7CGACGGTGCGACCGGC77CGACGCGGCG77C77CGG 700 
HGG FLDGA7G FDAAFFG 
50 GATCAGCCCGCGCGAGGCCCTGGCCA7GGACCCGCAGCAACGGGTGC7CC 7 5 0 
ISPREALAMDPQQRVL 
TGGAGACG7CC7GGGAGGCGT7CGAAAGCGCGGGGA7CACCCCGGACGGG 8 00 
LETSWEAFE SAG ITPDA 
GCGCGGGGCAGCGACACCCCCG7G77CA7CGGCGCG77CTCC7ACGGGTA S 50 
55 A R G S D T G V F I G A F S Y G Y 

— vjo i — — G G A T A C 3 A--. C ^ j 3 ~ T 3 GG C 3 3 G AC AGG GT CGGAG ACC A i' 3 «.- 

G7GAD7NG FGA7GSQ7 
GGG7GC7G7CCGGCCGCC7C7CG7AC77C7ACGG7C7GGAGGGCCC77CG 950 
SVLSGRL5Y. FYGLEGPS 
3TCACGGTCGACACCCCCTGCTCGTCGTCACTGGTCCCCCTGCACCAGGC 10C0
v T V D T A C 3 S S L V A L H Q A 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTGGGCGGTG 1050 
3 0 S L ?. 5 G E C S L A L V G G 
S TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
V T V M A S P G G F V E r 3 R Q R 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG- 12 00 

10 tsfaegagalvverls 

acgcggagcgccacggccacaccgtcctcgccctcgtacgcggctccgcg 12 50 
-■ a e r h g h t v l a l v r g s a 
gctaactccgacggcgcgtcgaacggtctgtcggcgccgaacggcccctc 1300 
ansdgasngl5apngps 
1 5 ccaggaacgcgtcatccaccaggccctcgcgaacgcgaaactcacccccg 1350 
qerv i hqalanaklt ? 
ccgatgtcgacgcggtcgaggcgcacggcaccggcacccgcctcggcgac 14 00 
advdaveahgtgtrlgd 
cccatcgaggcgcaggcgctgctcgcgacgtacggacaggaccgggcgac 14 5 0 
20 ? ieaqalla tygqdrat 

3cccctgctgctcggctcgctgaagtcgaacatcgggcacgcccaggccg 1500 

plllgslksnigha. -qa 
cgtcaggggtcgccgggatcatcaagatggtgcaggccatccggcacggg 1550 
asgvagi i kmvqai rhg 
25 gaactgccgccgacactgcacgcggacgagccgtcgccgcacgtcgactg 1600 
elpptlhadepsphvdw 
gacggccggtgccgtcgagctcctgacgtcggcccggccgtggccgggga 1650 

tagavelltsarpwpg 
ccggtcgccctaggcgggcgggcgtgtcgtccttcggagtcagcggcacc 1700 
30 7grprragvss fgvsgt 

aacgcccacgtcatcctggag agcgcaccccccgctcagcccgcggagga 17 50 

nahvi lesappaqpaee 
ggcgcagcctgttgagacgccggtggtggcctcggatgtgctgccgctgg 1800 
aqpvetpvvasdvlpl 
* 35 tgatatcggccaagacccagcccgccctgaccgaacacgaagaccggctg 13 50 
v i s a k t q p a l t e h e d r l 
cgcgcctacctggcggcgtcgcccggggcggatatacgggctgtggca7c 1900 

raylaas pgadiravas 
gacgctggcggtgacacggtcggtgttcgagcaccgcgccgtactccttg 1950 
40 .tlavtrsvfehravll 

gagatgacaccgtcaccggcaccgcggtgaccgaccccaggatcgtgttt 2 000 
gddtvtgtavtdprivf 
gtctttcccgggcaggggtggcagtggctggggatgggcagtgcactgcg 2 050 
vfpgqgwqwlgmgsalr 
45 cgattcgtcggtggtgttcgccg agcggatggccgagtgtgcggcggcgt 2100 
ds svvfae rmaecaaa 
tgcgcgagttcgtggactgggatctgttcacggttctggatgatccgfccg 2150 
lre fvdwdl ftvlddpa 
gtggtggaccgggttgatgtggtccagcccgcttcctgggcgatgatggt 2200 
50 7 vdrvdvvqpaswammv 

ttgcctggccccggtgtggcaggcggccggtgtgcggccggatgcggtga 22 50 

slaavwqaagvrpdav 
7cggccattcgcagggtgagatcgccgcagcttctgtggcgggtgcggtg 2 300 
:ghsqgeiaaacvagav 
55 7cactacccgatcccgcccggatcgtgacct7gcgcagccaggcgatcgc 2 3 50 

S L R D A A R i V T h R S Q A I A 
GCGGGGCCTDGCGGGCCGGGGCGCGATGGCATCCGTCGCCGTGCCCGCGC 2 A oc 

--- C L A G K G A M A i; V A L ? A 
AGGATGTCGAGCTGGTCGACGGGGCCTGGA.TCGCCGCCCACAACGGGCCC 2 4 50 
60 IDVELVDGAWIAAHNGP 
gcctcc.-.rcgt3a7cgcgggcaccccggaagcggtc3accatctcctcac 2 sco

stv:agtpeavdhvlt 
cgc7catg aggc.-.g aaggggtgcgggtgcggcgg at g.-.ccg7cgactatg 2 5 5c 

■~- h z .-. g v r v r r : 7 7 d v 

5 cc7cgca7acccggcacgtcgagctgatccgcgacgaac7ac7cgacatc 2 600 
a s ;{ 7 ? h v e l i r d z l l d i 

ACTAGCGACAGCAGCTCCCAGACCCCGCTCGTGCCG7GGCTGTCGACCGT 2 650 

'-"SZSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 2700 
10 DGTWVDSPLDGEYWYR 

ACCTGCG7GAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 27 50 
NLREPVGFHPAVSQLQA 
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 
QGDTVFVEVSASPVLLQ 
1 5 GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 28 50 
AM DDDVVTVATLRRDD 
GCGACGCZACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2900 
G DA T ?, MLTAL AQA YVHG 
GTCACCG7CGAC7GGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2 950 
20 V T V D W P A I L G T T T T R V L 

GGACCT7CCGACCTACGCCTTCCAACACCAGCGGTAC7GGCTCGAGTCGG 3000 

0 L ? T Y A F Q H Q R Y W L E S 
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCC7GGGCACCGGAGTC 3050 
APPATADSGHPVLGTGV 
25 GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100 
AVAGSPGRVFTGPVPAG 
TGCGG ACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 

AD RAVFIAELALAAAD 
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200 
30 ATDCATVEQLDVTSVPG 

GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250 

GSARGRATAQTWVDEPA 
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300 
ADG RRRFTVHTRV GDA 
35 CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350 
PWT LHAEGVLRP GRVPQ 
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 34 00 

? E ' A V DTAW P P P G A V PAD 
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 34 50 
40 G L ? G A W R R A D Q V F V E A 

AAGTCGACAGCCC7GACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500 
EVCS PDG FVAHPDLLDA 
GTCTTCTCCCCGG7CGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550 
VFSAVGDGSRQPTGWRD 
45 CC7CGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600 
LAVHASDATVLRACLT 
GCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650 
RRDSGVVELAAFDGAGM 
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 37 00 
50 ?VL TAESVTLGEVASAG 

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 37 50 

GS DESDGLLRLEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 38 00 
AE'AHYDGADELPEGYTL 
55 ATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 38 50 
ITATHPDDPDDPTNPHN 
7ACACCCACACGCA7CCACACACAAACCACACGCG7C77CACCG7CC7CC 3 900 

7 ? T r. 7 H T Q T T R V L T A L 
AACACCACCTCATCACCACCAACCACACCCTCATCG7CCACACCACCACC 3950 
60 Q H H L I T 7 N H T L I 7 H 7 T T 
GACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACGA 4 000

0 P F G A A V T G L T R T A Q N E 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACC ACCCCCACACCCCAC 4 050 

; " : ? C F; ■ .■• L I E T H :H ? H 7 = 

- TCCCCCTCACCCAACTCACCACCCTCCACCAA.CCCCACCTACGCCTCACC 4 100 
L ? L T Q L T T L H Q ? H L R 1 T 

AACAACACCC7CCACACCCCCCACC7CACCCCCA7CACCACCCACCACAA 4 150 
N N T L H T P H L T ? I ? T H H N 

CACCACCACAACCACCCCCAACACCCCACCCCTCA.^CCCCAACCACGCCA 4 200 
10 "TTTTPNTPP^NPNHA 

TCC7CA7CACCCGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCC3CCAC 4 250 
I1ITGGSGTLAGILARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4 300 
L N H PHTYLLSRTPPPPT 
1 5 CACACCCGGC ACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4 350 
TPGTHIPCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 4 4 00 
TQALTH T PQPLTGI F H 7 
GCCGCCACCC7CGACGACGCCACCC7CACCAACC7CACCCCCCA.ACACC7 4 4 50 
20 AATLDDATLTNLTPQHL 

CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4 500 

TTTLQPKADAAWHLHH 
ACACCCAAAACCAACCCC7CACCCAC77CG7CC7C7AC7CCAGCGCCGCC 4 550 
HTQNQPLTHFVL YSSAA 
25 GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4 600 
ATLG3 PGQANYAAANAF 
CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4 600 

LDALATHRHTQGQPAT 
CCATCGCCTGGGGC ATGTGGCACACCACCACCACACTC ACCAGCCAACTC 4 7 00 
30 TIAWGMWHTTTTLTSQL 

ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCTC 4 7 50 

7DSDRDRIRRGGFLPIS 
GGACGACGAGGGCATGC 
D D E G M 



The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of module
12 of rapamycin is shown below.



GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
MRLYEAARRTGSPVVV 
40 CCGGCCGCGCTCG^ZGAGGGGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALCDAPDV PLLRGLR 
GCGTACGACCG7CCGGCGTGCCGCCGTCCGGGAAGGCTCTCTCGCCGACC 150 

RTTVRRAAVR ERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
45 RSPCCPTTSAPTPPSRS 

TCC7GGAACAGCACCGCCACCCTGCTCGGCCACCTGGGCGCCGAAGACAT 2 50 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCA7CGACTCGCTCACCGCGG 300 
PATTTFKELGI DSLTA 
50 7CCAGC7GCGCAACGCGC7CACCACGGCGACCGGCG7ACGCC7CAACGCC 350 
VQLRNALTTATGVRLNA 
AC AGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
55 DELAGTRAPVAARTAA 

T A A A H DE PLA I V G M A C R 
C7GCCSGGCGCGG7CGCG7CGCCACAGGAGCTG7GGCGTC7CGTCGCGTC 5 5 0 

L ? G G V A S P Q E L W P. L V A 5 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600

GTDAI 7 E F PADRGWDV 
ACGCGCTCT ACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
"* A 1 : D P D ? 2 & 1 G K ? F V f, 
3 CACGGCGGCT7CCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HG GFLDGA?GFDAAFFG 
GA7CAGCCCCCGCGAGGCCC7GGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 

ISPREALAMDPQCRVL 
TGGAGACGTCCTGGGAGGCGTTCGP.AAGCGCGGGCATCACCCCGGACGCG 300 
10 -ETSWEAFESAGITPDA 

GCGCGGGGCAGCGACACCGGCGTG7TCATCGGCGCGTTC7CCTACGGG7A 8 50 

ARGSDTGVFIGAFSYGY 
CGCCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNG FGATGSQT 
1 5 GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTAC S S SLVALHQ.A 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
-0 GQSLRSGECSLALVGG 

TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
CLAP. DGRAKAFGAGADG 
25 TACGAGCTTCGCCGAGGGCGCCGG7GCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TS FAEGAGALVVERLS 
A.CGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 
30 ANSDGASNGLSAPNGPS 

CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 

QERV I HQALANAKL7P 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAH G.TGTRLGD 
35 CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 

PLLLGSLKSNIGHAQA 
CG7CAGGGG7CGCCGGGA7CA7CAAGA7GG7GCAGGCCA7CCGGCACGGG 15 50 
40 A5G 7 AG I I KM VQAI RHG 

GAAC7GCCGCCGACAC7GCACGCGGACGAGCCG7CGCCGCACGTCGAC7G 1 600 

ELPP7LHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGC7CCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
TAGAVELLTSARPWPG 
45 CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCG7GAGCGGCACG 17 00 
7GRPRRAAVSSFGVSG7 
AACGCCCACA7CA7CC77GAGGCAGGACCGG7CAAAACGGGACCGG7CGA 17 50 

NAH I ' I LE'AG PVKTGPVE 
GGCAGGAGCGA7CGAGGCAGGACCGG7CGAAG7AGGACCGG7CGAGGC7G 18 00 
50 AGAIEA.GPVEVGPVEA 

GAGCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 18 50 
GPLPAAPPSAPGEDLPL 
C7CG7G7CGGCGCC77CCCCGGAGGCAC7CGACGAGCAGA7CGGGCGCC7 1900 
LVSARSPEALDEQIGRL 
55 GCGCGCC7A7C7CGACACCGGCCCGGGCG7CGACCGGGCGGCCG7GGCGC 1950 
RAYLD7G PGVDRAAVA 
AGACAC7GGCCCGGCGTACGCACTTCACCCACCGGGCCG7ACTGCTCGGG 2000 
Q T L • A R R 7 H F T H R A V L- L G 
GACACCG7CA7CGGCGC7CCCCCCGCGGACCAGGCCGACGAAC7CG7C77 2050 
60 07 V I G A P P A DQA DE LV F 
CC-TCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100

VYSGQGTQH P A M G E Q L 
CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 215 0 

A A A F ? 7 F A R I H Q Q V W D L 

5 -"GATGTGCCCGA.7CTGGAGG7GAACGAGACCGGTTACGCCCAGCCGGC 2 2 00 
- D V PDLEVNETGYAQPA 
3C7GTTCGCAATGCAG3TGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 22 5 0 
- F A M 0 7 A L F G L L E S W G 

7ACGACCGGACGCGG7GATCGGCCATTCGG7GGGTGAGCTTGCGGCTGCG 2 300 
10 P. P D A V I G H S V G E L A A A 

TATG7G7CCGGGG7G73G7CG77GGAGGATGCC7GCACTTTGG7GTCGGC 2 3 50 

V V S G V W S L E D A C T L V S A 
GCGGGCTCGTCTGA7GCAGGC7CTGCCCGCGGG7GGGGTGATGGTCGCTG 2 4 00 
SlARLMQALPAGGVMVA 
1 5 7CCCGG7C7CGGAGGA7GAGGCCCGGGCCG7GC7GGG7GAGGG7G7GGAG 24 50 
VPVSEDEARAVLGEGVE 
ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500 

IAAVNGPSSVVLSGDEA 
CGCCG7GCTGCAGGCCGCGGAGGGGCTGGGGAAG7GGACGCGGCTGGCGA 2550 
20 A VLQAAEGLGKWTRLA 

CCAGCCACGCGT7CCAT7CCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2 600 
7 S HAFH SARME PMLEEF 
CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTC7CCATGGC 2 650 
?-AVAEGLTYRTPQVSMA 
25 CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 2700 
V G D Q 7 7 7 A E Y W V R Q V R 
ACACGG7CCGG77CGGCGAGCAGG7GGCCTCG7ACGAGGACGCCGTGTTC 27 50 ^ 
DTVRFGEQVASYEDAVF 
G7CGAGC7GGG7GCCGACCGG7CAC7GGCCCGCCTGGTCGACGGTGTCGC 28 00 
30 VELGADRS LARLVDGVA 

GATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCC 28 50 * * 

MLHGDHEIQAAIGALA 
ACC7GTATG7CAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGA7 2 900 
HLYVNGV7VDWPAL LGD 
35 3C7CCGGCAACACGGG7GCTGGACCTTCCGACA7ACGCCTTCCAGCACCA 2950 ■* 
APA7RVLDLPT YAFQHQ 
GCGC7ACTGGC7CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RY. WLESAP PATADSGH 
CCG7CC7CGGCACCGGAGTCGCCGTCGCCGGGT CGCCGGGCCGGG7GTTC 3050 _ 
40 PVLG7 0VAVAGSPGRVF 

ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100 

TGPVPAGADRA VFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 
ALAAADATDCATVEQL 
45 ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 32 00 
3V7SVPGGSARGRA7AQ 
ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250 

7WVDEPAADGRRRF7VH 
3ACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3 300 
50 7RVGDAPWTLHAEGVL 

GCZCCGGCCGCGTGCCCCAGCCCGPAGCCGTCGACACCGCCTGGCCCCCG 3 3 50 
-£?GRVPQPEAVDTAWPP 

rC3GGCGCGG7GCCC3CGGACGGGCTGCCCGGGGCG7GGCGACGCGCGGA 34 00 
rGAVPADGL PGAWRRAD 
55 rCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 34 50 
Q V F V E A E V - D S P D G F V A 
ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3 500 
r. ? DLLDAVFSAVGDG5R 
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 35 50 
60 ~: PTG WRDL AVHASDATV 
GC7GCGCGCC7GCCTCACCCGCCGCGACAG7GG7G7CGTGGAGC7CGCCG 3600
L R A C L T R R D S G V V E L A 

CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3650 
A F G G A G M P V L T A I S 7 T 

5 GGCGAGG7CGCGTCGGCAGGCGGATCCGACGAGTCGGACGC7CTGCTTCG 37 00 
GEVASAGGSDES OGLLR- 
GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGGGGACGAGt 37 50 

LEWLPVAEAHYDGADE 
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 3800 
10 L^iGYTLITATHrijCPD 

GACCCCAGGAACCCCCACAACACACCCACACGCACCCACACACAAACCAC 38 50 

DP7N PHN7P7R7H7Q77 
ACGGC7CC7GACCGGCG7GCAACACCACC7CATGAGCACCAACGACACCG 3900 
RV1TALQHHLITTNHT 
15 TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCGGTCACCGGCCTC 3950 
LIVHTTTDPPGAAVTGL 
ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4 000 

TRTAQNEHPGRIHLIET 
CCACCACCGCCACACCCCAC7CCCCC7CACCCAAC7GA.CCACCC7CCACC 4 050 

20 hh?htplpltqlt"tlh 

aaccccacctacgcctcaccaacaacaccctccacaccccccacctcacc 4 100 
qph lrl7nn7lh7phl7 
cccatcaccacccaccacaacaccaccacaaccacccccaacaccccacc 4 150 
pi 77hhnt7777pn7pp 
25 cctcaaccccaaccacgccatcctcatcaccggcggctccggcaccctcg 4 200 • 
ln pnhailitggsgtl 
ccggcatcctcgcccgccacctcaaccacccccacacctacctcctctcc 4 250 
agi larhlnhph7ylls 
cgcacaccaccaccccccaccacacccggcacccacatcccctgcgacct 4 300 
30 rt?pppttpgth:? cdl 

caccgaccccacccaaatcacccaagccctcacccaca7accacaacccc 4 350 

tdptqitqalth i pqp 
7caccggca7c77ccacaccgccgccaccc7cgacgac gccaccc7cacc 4 4 00 
ltg i fhtaatlddatlt 
35 aacctcaccccccaacacctcaccaccaccctccaacccaaagccgacgc 4 4 50 
n l t p q h ltttlq p k a d a 
cgcctggcacctccaccaccacacccaaaaccaacccctcacccacttcg 4 500 

awklhhhtqnqplthf 
tcctctactccagcgccgccgccaccctcggcagccccggccaagccaac 4 550 
40 vlyssaaatlgspgqan 

TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4 600 

YAAANAFLDALAT H RHT 
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4 650 
QGQ PATTIAWGMWHTT 
45 CCACACTCACCAGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 4 700 
TTLTSQLTDSDRDRIRR 
GGCGCCTTCCTGCCGATCTCGGACGACGAGGGCATGC 
GGFLPISDDEGM 

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of module
13 of rapamycin is shown below.

• CCA7GCGGC7G7AGGAGGCGGCACGGCGCACCGGAAG7CGGG7GGTGGTG 50 
MP. LYEAARRTG3PVVV 
GCGGCCGGGG7CGAGGACGCGCCGGACG7GCCGC7GG7GCGCGGGG7GGG 100 
55 AAALDDAPDVPLLRGLR 

.r^\_j i Av.^-— -s-'s_oTCGGGGG7GGGGCCGTCCGGGAA.CGG7 GT G7GG GCGAGC 15 0 

R T T . V R R A A V R E . R S L A D 
GC7CGCCG7GC7GCCCGACGACGAGCGCGCCGACGGC7CGC7GGCG77CG 200 
RSFCCP7TSAP7FPSRS 
^™GGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 2 50 

= WNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
r A T T 7 ~ K I L G I D S 1 7 A 
:> TCCAGC7GCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGG7CTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

VFDFPTPRAL.AARLG 
CGAGGACC7GGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
10 - t L A G T R A P V A A R T A A 

CCGGGGGCGCGCACGACGAACCGCTGGCGATCGTGGGCA7GGCC7GCCGT 500 
TAAAH D E ? L A I VGMACR 
G7GCGGGGCGGGG7GGCGTCGCCACAGGAGCTG7GGCG7CTCGTCGCG7C 5 50 
-PGGVASPQELWRLVAS 
1 5 CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAI TEFPADRGWDV 
ACGCGCTC7 ACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 - 
DALYDPD PDA IGKTFVR 
CACGGCGGCTTCCTGGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
20 HGGFLDCATGFDAAFFG 

GATCAGCCCGCGCG AGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 

IS PREALAMDPQQRVL" 
TGG AG ACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 8 00 
LETSWEAFESAGITPDA 
25 GCGCGGGGCAdCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 8 50 
ARGS DTGVFIGAFSYGY 
CGGCAGGGG7GCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

'GTGADTNG FGATGSQT 
GCG7GC7C7CCGGCCGCC7C7CG7ACTTCTACGGTCTGGAGGGCCCTTCG 950 
30 SVLSGRLSYFYGLEGPS 

G7CACGG7CGACACCGCC7GCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

V7VD7ACSSS LVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAXTGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG 
35 TCAGGG7GATGGCG7CGCCCGGCGGATTCGTCGAGTTCTCCGGGCAGCGC 1100 
V7VMAS PGG F-VE FS RQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPCGRAKAFGAGADG 
TAGGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
40 75 FAEGAGALVVERL S 

ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 12 50 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 
ANS DGASNGLSAPNGPS 
45 CG AGG.AACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
OERV I HQALANAKLTP 
CCGA7GTCGACGCGG7CGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
A D V D . A V E . A H G T G T R L G D 
-CCATCGACCCGCAGGCCCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
50 ? I E A Q A L L A T Y G Q D R A 7 

GCCCC7GCTGC7GGGCTCGC7GAAGTCGAACA7CGGGCACGCCCACGCGG 1500 

FLLLGSL KSNIGHAQA 
CG7CAGGGG7GGCCGGGA7CA7GAAGA7GG7GCAGGCCA7CGGGGACGGG 15 50 
ASGVACIIKMVQAIRHG 
55 GAAC7GCCGCGGACAC7GCACGCGGACGAGCGG7CGCCGCACC7CCAC7G IvOO 
r. LPPTLHADEPSPHVDW 
, A\_o - C J GG7GCCG7CG AGC7CC7G ACG7CGGCCCGGCCG7GGCCGGGGA 165 0 

7 A G A VELLTSARPWF3 
CCGG7CGCCCGCGGGGCGG7GCCG7CTCGTCG7TCGGCG7GAGCGGCAGG 1700 
60 7GRPRRAAVSS FGVSG7 
AACGCCCACATCATCCT7GAGGCAGGACCGGTCAAAACGGGACCGGTCGA 17 50 

N A H I ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1300 
A G A I « A G ? 7 £ V G P V £ A 

5 GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 18 50 
G?L PAAPPSAPGEDLPL 
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1 900 

LV SARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 195 0 
10 KAVLDTG PGVDRAAVA 

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCCTACTGCTCGGG 2 000 
QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
DTVIGAPPADQADELVF 
15 CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VYSGQGTQH PAMGEQL 
CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150 
ADS S VVFAERMAECAAA 
TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 
20 LREFVDWDL FTVLDDPA 

GGTGGTGGACCGGGT7GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2 2 50 

"VDRVDVVQPASWA MM 
TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2 300 
VSLAAVWQAAGVRPDAV 
25 ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 
IGH SQGE I AAACVAGAV 
GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 2 4 00 

SLRDAARIVTLRSQAI 
CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 24 50 
30ARGLAGRGAMASVALPA 

CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2 500 

QDVELVDGAWIAAHNGP 
CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2 550 
ASTVIAGT PEAVDHVL 
35 CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2 600 
TAHEAQGVP. VRRITVDY 
GCCTCGCACACCCCGC ACCTCGAGC7GATCCGCGACGAACTACTCGACAT 2 650 

ASHTPHVELIRDELLDI 
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 27 0C 
40 TSDSSSQTPLVPWLST 

TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 27 50 
VDGTWVDS PLDGEY WYR 
AACCTGCG.TGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 28 00 
NLREPVGFH PAVSQLQA 
45 CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 28 50 
QGDTVFVEVSASPVLL 
AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2 900 
QAM DDDVVTVATLRRDD 
C3GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2 950 
50 GDATRMLTALAQAYVHG 

CCTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000 

V T V D W P A I LGTTTTRV 
TGGACCTTCCCACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 30 5 0 
LDL PTYAFQHQRYWLES 
5 5 GC7CCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100 
APPATADSGHPVLGTGV 
CGGCGTCGCCGGGTC GCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 315 0 

A V A G S ? G ?. V F T G P V P A 
GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 32 00 
60 GADRAVFI AELALAAAD 
3CGACCGACTGCGCCACGGTCGAACAGCTCGACG7CACC7CCGTGCCCGG 32 5 0 

A T D C A. T V E Q L D V T S V P G 
■-GGATCCGCCCGCGGCAGGGCCACCGCGCAGACC7GGGTCGATGAACCCG 3300 
:: " S « ?■ ^ ?■ A T A Q 7 W V D I P 
} C CGCCGACGGGCGGCGCCCCTTCACCGTCCACACCCGCG7CGGCGACGCC 3 3 50 
AADGRRRFTVHTRVGDA 
CCG7GGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 34 00 

PWTLKAEGVLRPGRVPQ 
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 3 4 50 
10 PEAVDTAWPPPGAVPA 

ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3 500 
DGLPGAWRRADQVFVEA 
G.AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCC3ACCTGCTCGACGC 3 5 50 
Z V D S r DGFVAH ? 3 L L D A 
1 5 G37CTTC7CCGCGG7CGGCGACGGGAGCCGCCA.GCCGACCGGATGGCGCG 3600 
VFSAVGDGSRQPTGWR 
ACCTCGCGGTGCACGCGTCGGACGCCACCGTGC7GCGCGCCTGCCTCACC 3 650 
BLAVHASDATVLRACLT 
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 37 00 
20 RRDSGVVELAAFD GAGM 

GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 37 50 

PVLTAESVTLGEVASA 
GCGGATCCGACGAGTCGGACGGTCT.GCTTCGGCTTGAGTGGTTGCCGGTG 3800 
GGS DESDGLLRLEWLPV 
25 GCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 38 50 
AEAHY DGADELPEGYTL 
CATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 

ITATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3 950 
30 NT PTRTHTQTTRVLTAL 

CAACACCACCTCATCACCACCAACCACACCCTCATCG7CCACACCACCAC 4 000 

QHHLI TTNHTLIVHTTT 
CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4 050 
DP PGAAVTGLTRTAQN 
35 AACAGCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCA 4 100 
EHPGRIHLIETHHPHTP 
C7CCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4 150 

-PLTQLT7LHQPHLRLT 
7AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4 200 
40 NNTLHTPHLTPITTHH 

ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4 250 
NTTTTTPNTPPLNPNHA 
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4 300 
I LI TGGSGTLAGILARH 
45 CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4 350 
LNHPHTYLLSRTPPPP 
CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4 4 00 
TTPGTHIPCDLTDPTQI 
ACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACAC 4 4 50 
50 T Q A L T H I PQPLTGI FHT 

CGGCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4 500 

AATLDDATLTNLTPQH 
7CACCACCACCCTCCAACCCAAAGCCGACGCCGCC7GGCACC7CCACCAC 4 5 50 
L T T T L Q P K A D A A W H L K H 
55 rACACCCAAAACCAJkCCCCTCACCCACTTCG7CC7C7ACTCCAGCGCCGC 4 600 
H TQNQ ?LTH F V 1 Y S SAA 
:rCCACCCTCGGCA"CCCCGGCCAAGCCAACTACGCCGCCGCC^ACGCC7 -.6 50 

7 L G 3 P G Q A N Y A A A' U A 
TCCTCGACGGCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4 7 00 
60 f'LDALATHRHTQGQPAT 
A C C A 7 C " C 7 G G G G C A T G T G G C AC AC C AC C AC C AC AC T C AC C AG C C AAC T 4 7 50 

'-' I A W G M W H T T T 7 L T S Q L 
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGC77CCTGCCGATCT 4 8 00 
- P. IRRGGTLPI 

5 CGGACGACGAGGGCATGC 
S D D £ G M 

Example 3 

Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520 
The present invention provides a variety of recombinant PKS genes in addition to
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520 and
FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent No.
5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT coding
sequences have been replaced by either the rapAT3 (the AT domain from module 3 of the
rapamycin PKS), rapAT12, eryAT1 (the AT domain from module 1 of the erythromycin
(DEBS) PKS), or eryAT2 coding sequences. Each of these constructs provides a PKS that
produces the 13-desmethoxy-13-methyl derivative, except for the rapAT12 replacement,
which provides the 13-desmethoxy derivative, i.e., it has a hydrogen where the other
derivatives have methyl.

Figure 7 shows the process used to generate the AT replacement constructs. First, a
fragment of ~4.5 kb containing module 8 coding sequences from the FK-520 cluster of
ATCC 14891 was cloned using the convenient restriction sites SacI and SphI (Step A in
Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment comprising
module 8 coding sequences from other FK-520 or FK-506 clusters can be different
depending on the DNA sequence, but the overall scheme is identical. The unique SacI and
SphI restriction sites at the ends of the FK-520 module 8 fragment were then changed to
unique BglII and NsiI sites by ligation to synthetic linkers (described in the preceding
Examples, see Step B of Figure 7). Fragments containing sequences 5' and 3' of the AT8
sequences were then amplified using primers, described above, that introduced either an
AvrII site or an NheI site at two different KS/AT boundaries and an XhoI site at the AT/DH
boundary (Step C of Figure 7). Heterologous AT domains from the rapamycin and
erythromycin gene clusters were amplified using primers, as described above, that
introduced the same sites as just described (Step D of Figure 7). The fragments were ligated
to give hybrid modules with in-frame fusions at the KS/AT and AT/DH boundaries (Step E
of Figure 7). Finally, these hybrid modules were ligated into the BamHI and PstI sites of the
KC515 vector. The resulting recombinant phage were used to transform the FK-506 and
FK-520 producer strains to yield the desired recombinant cells, as described in the
preceding Examples.

The following table shows the location and sequences surrounding the engineered
site of each of the heterologous AT domains employed. The FK-506 hybrid construct was
used as a control for the FK-520 recombinant cells produced, and a similar FK-520 hybrid
construct was used as a control for the FK-506 recombinant cells.



Heterologous AT      Enzyme   Location of Engineered Site

FK-506 AT8           AvrII    GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC
(hydroxymalonyl)              GRPRRAAVSSF
                     NheI     ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC
                              TQHPAMGERLA
                     XhoI     TACGCCTTCCAGCGGCGGCCCTACTGGatcgag
                              YAFQRRPYWIE

rapamycin AT3        AvrII    GACCGGccccgtCGGGCGGGCGTGTCGTCCTTC
(methylmalonyl)               DRPRRAGVSSF
                     NheI     TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG
                              WQWLGMGSALR
                     XhoI     TACGCCTTCCAACACCAGCGGTACTGGgtcgag
                              YAFQHQRYWVE

rapamycin AT12       AvrII    GGCCGAgcgcgcCGGGCAGGCGTGTCGTCCTTC
(malonyl)                     GRARRAGVSSF
                     NheI     TCGCAGCGTGCTGGCATGGGTGAGGAactggcC
                              SQRAGMGEELA
                     XhoI     TACGCCTTCCAGCACCAGCGCTACTGGctcgag
                              YAFQHQRYWLE

DEBS AT1             AvrII    GCGCGAccgcgcCGGGCGGGGGTCTCGTCGTTC
(methylmalonyl)               ARPRRAGVSSF
                     NheI     TGGCAGTGGGCGGGCATGGCCGTCGAcctgctC
                              WQWAGMAVDLL
                     XhoI     TACCCGTTCCAGCGCGAGCGCGTCTGGctcgag
                              YPFQRERVWLE

DEBS AT2             AvrII    GACGGGgtgcgcCGGGCAGGTGTGTCGGCGTTC
(methylmalonyl)               DGVRRAGVSAF
                     NheI     GCCCAGTGGGAAGGCATGGCGCGGGAgttgttG
                              AQWEGMARELL
                     XhoI     TATCCTTTCCAGGGCAAGCGGTTCTGGctgctg
                              YPFQGKRFWLL
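
The peptide line printed under each DNA sequence in the table can be checked against the DNA with a short translation routine. The following Python sketch is illustrative only: it uses the standard genetic code, the helper name `translate` is hypothetical, and the three rows are transcribed from the table above. The transcription of the lower-case regions is an assumption and should be checked against the original sequence listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; case is ignored, trailing bases dropped."""
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# (heterologous AT, enzyme) -> (DNA as transcribed, peptide line from the table)
ROWS = {
    ("rapamycin AT3", "NheI"): ("TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG", "WQWLGMGSALR"),
    ("rapamycin AT3", "XhoI"): ("TACGCCTTCCAACACCAGCGGTACTGGgtcgag", "YAFQHQRYWVE"),
    ("rapamycin AT12", "NheI"): ("TCGCAGCGTGCTGGCATGGGTGAGGAactggcC", "SQRAGMGEELA"),
}

if __name__ == "__main__":
    for (domain, enzyme), (dna, peptide) in ROWS.items():
        status = "matches" if translate(dna) == peptide else "DIFFERS"
        print(f"{domain} / {enzyme}: {translate(dna)} {status}")
```

A mismatch between the translated DNA and the printed peptide would point to a transcription error in one of the two lines rather than to a property of the construct.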



The sequences shown below provide the location of the KS/AT boundaries chosen
in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites were
engineered are indicated by lower case and underlining.
^GSGCCCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGG ccacQq C 
AGAV£LLTSARPWPETDRPR 

GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG 

? A A v s s r G '/ $ G T ; I A H v : 1 I a 

3 GACCCGTAACGGAGACGCCCGCGGCATCGCCTTCC3GTGACCTTCCCCTGCTGGTGTCGG 
GPVTET PAASPSGDLPLL.VS 
CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA 

a R s psaldeqirrlrayldt 

CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGC7GGCCCGGCGCACACACTTCGCCC 
10 TPDVDRVAVAQTLARR THFA 

ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 
HRAVLLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCA qct eg 
ELVFVYSGQGTQH PAMGEQL 
15 CCGCCGCCCATCCCGTG7TCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC 
AAAH PVFADAWHEALRRLDN 

The sequences shown below provide the location of the AT/DH boundary chosen in
the FK-520 module 8 coding sequences. The region where an XhoI site was engineered is
indicated by lower case and underlining.

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC 
I LGAG3RH DADVPAYAFQRR 
ACTACTGGatc^aaTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT 
HYW I ESARPAASDAGHPVLG 

25 

The sequences shown below provide the location of the KS/AT boundaries chosen
in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites were
engineered are indicated by lower case and underlining.

TCGGCCAGGCCGTGGCCGCGGACCGGCCGT ccqcqc CGTGCGGCGGTCTCGTCGTTCGGG 
30 SARPWPRTGRP .RRAAVSS FG 

GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG 

V S G T N A H I I LEAGPDQEEPS 
GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC 

A E PAG D L P L LVSA RS PEA L D 
35 GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC 

lQIGRLRDY LD AAPGVDLAA 
GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGCCGTACTGCTCGGTGAC 

VARTLATRTHFSHRAVLLGD 
ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA 
40 TVITAPPVEQPGELVFVYSG 
CAGGGCACCCAGCATCCCGCGATGGGTGAGCG qctcac CGCAGCCTTCCCCGTGTTCGCC 

QGTQH PAMGERLAAAFPV FA 
GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGATCGAGTCCGCGCCG 

OPDVPAYAFQRRPYWIESAP 

45 

The sequences shown below provide the location of the AT/DH boundary chosen in
the FK-506 module 8 coding sequences. The region where an XhoI site was engineered is
indicated by lower case and underlining.

GACCCGGACGTACCCGCCTACCCCTTCCAGCGGCGGCCCTACTGGat_coaaTCCGCGCCG 
50 : p D V ? A Y A 7 Q R R F Y W I E 5 A ? 
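
As a rough illustration of what engineering a site at one of these boundaries involves, the following Python sketch substitutes a restriction enzyme recognition sequence for the lower-case hexamer in a marked sequence and counts the base changes required. The recognition sequences (AvrII CCTAGG, NheI GCTAGC, XhoI CTCGAG) are standard; the assumption that each site is introduced by direct replacement of the lower-case hexamer, the function name `engineer_site`, and the transcription of the FK-506 AT/DH boundary are illustrative and not taken from the primer designs themselves.

```python
import re

# Recognition sequences of the three enzymes used at the domain boundaries.
RECOGNITION_SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}


def engineer_site(marked: str, enzyme: str) -> tuple[str, int]:
    """Replace the single lower-case run in `marked` with the enzyme's site.

    Returns the engineered sequence (upper case) and the number of base
    changes relative to the native lower-case region.
    """
    site = RECOGNITION_SITES[enzyme]
    runs = list(re.finditer(r"[a-z]+", marked))
    if len(runs) != 1:
        raise ValueError("expected exactly one lower-case region")
    run = runs[0]
    native = run.group().upper()
    if len(native) != len(site):
        raise ValueError("lower-case region and recognition site differ in length")
    changes = sum(1 for a, b in zip(native, site) if a != b)
    engineered = marked[:run.start()].upper() + site + marked[run.end():].upper()
    return engineered, changes


if __name__ == "__main__":
    # FK-506 module 8 AT/DH boundary, lower case marking the engineered region.
    boundary = "TACGCCTTCCAGCGGCGGCCCTACTGGatcgag"
    engineered, changes = engineer_site(boundary, "XhoI")
    print(engineered, "base changes:", changes)
    print("site occurrences:", engineered.count(RECOGNITION_SITES["XhoI"]))
```

Counting occurrences of the recognition sequence in the engineered fragment is a quick way to confirm that the new site would be unique within that fragment, which the ligation scheme relies on.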
Example 4

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520

The methods and reagents of the present invention also provide novel FK-506 and
FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or
methyl. These derivatives are produced in recombinant host cells of the invention that
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the
present invention provides recombinant PKS enzymes in which the AT domains of both
modules 7 and 8 have been changed. The table below summarizes the various compounds
provided by the present invention.





Compound   C-13       C-15       Derivative Provided

FK-506     hydrogen   hydrogen   13,15-didesmethoxy-FK-506
FK-506     hydrogen   methoxy    13-desmethoxy-FK-506
FK-506     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-506
FK-506     methoxy    hydrogen   15-desmethoxy-FK-506
FK-506     methoxy    methoxy    Original compound (FK-506)
FK-506     methoxy    methyl     15-desmethoxy-15-methyl-FK-506
FK-506     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-506
FK-506     methyl     methoxy    13-desmethoxy-13-methyl-FK-506
FK-506     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-506
FK-520     hydrogen   hydrogen   13,15-didesmethoxy-FK-520
FK-520     hydrogen   methoxy    13-desmethoxy-FK-520
FK-520     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-520
FK-520     methoxy    hydrogen   15-desmethoxy-FK-520
FK-520     methoxy    methoxy    Original compound (FK-520)
FK-520     methoxy    methyl     15-desmethoxy-15-methyl-FK-520
FK-520     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-520
FK-520     methyl     methoxy    13-desmethoxy-13-methyl-FK-520
FK-520     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-520
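
The naming pattern in the table is regular enough to be generated from the two substituents. The Python sketch below is only an illustration of that pattern: the function name `derivative_name` is hypothetical, and the rule it encodes (every non-methoxy position is listed as desmethoxy, and every methyl position is listed again as methyl) is inferred from the table rows rather than stated in the text.

```python
from itertools import product

SUBSTITUENTS = ("hydrogen", "methoxy", "methyl")


def derivative_name(parent: str, c13: str, c15: str) -> str:
    """Build the derivative name for the given C-13 and C-15 substituents."""
    groups = {13: c13, 15: c15}
    removed = [pos for pos, group in groups.items() if group != "methoxy"]
    methylated = [pos for pos, group in groups.items() if group == "methyl"]
    if not removed:
        return parent  # both methoxy groups retained: the original compound
    parts = []
    prefix = "di" if len(removed) == 2 else ""
    parts.append(",".join(map(str, removed)) + "-" + prefix + "desmethoxy")
    if methylated:
        prefix = "di" if len(methylated) == 2 else ""
        parts.append(",".join(map(str, methylated)) + "-" + prefix + "methyl")
    return "-".join(parts + [parent])


if __name__ == "__main__":
    for parent in ("FK-506", "FK-520"):
        for c13, c15 in product(SUBSTITUENTS, repeat=2):
            print(f"{parent:8}{c13:10}{c15:10}{derivative_name(parent, c13, c15)}")
```

Three substituents at each of two positions give nine combinations per parent compound, one of which is the parent itself, so the table lists eight derivatives each for FK-506 and FK-520.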



Example 5 

Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520 
The present invention also provides novel FK-506 and FK-520 derivative
compounds in which the methoxy groups at either or both the C-13 and C-15 positions are
instead ethyl groups. These compounds are produced by novel PKS enzymes of the
invention in which the AT domains of modules 8 and/or 7 are converted to
ethylmalonyl-specific AT domains by modification of the PKS gene that encodes the module.
Ethylmalonyl-specific AT domain coding sequences can be obtained from, for example, the
FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The novel PKS
genes of the invention include not only those in which either or both of the AT domains of
modules 7 and 8 have been converted to ethylmalonyl-specific AT domains but also those in
which one of the modules is converted to an ethylmalonyl-specific AT domain and the other
is converted to a malonyl-specific or a methylmalonyl-specific AT domain.
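
The relationship between AT domain specificity and the resulting substituent, as used in Examples 3 through 5, can be summarized in a small lookup. The Python sketch below is a simplified illustration under stated assumptions: module 8 is taken to set the C-13 substituent and module 7 the C-15 substituent, as described above, and a hydroxymalonyl-specific AT is taken to correspond to the methoxy group of the native compounds. The names `SUBSTITUENT_BY_EXTENDER` and `predicted_substituents` are hypothetical.

```python
from itertools import product

# Extender unit selected by the AT domain -> substituent at the ring position.
# The hydroxymalonyl -> methoxy entry reflects the native compounds and is an
# assumption of this sketch, not a statement about the enzymology.
SUBSTITUENT_BY_EXTENDER = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
    "hydroxymalonyl": "methoxy",
}

# PKS module whose AT domain determines each position.
POSITION_BY_MODULE = {8: "C-13", 7: "C-15"}


def predicted_substituents(at_by_module: dict[int, str]) -> dict[str, str]:
    """Map {module number: AT specificity} to {position: substituent}."""
    return {
        POSITION_BY_MODULE[module]: SUBSTITUENT_BY_EXTENDER[specificity]
        for module, specificity in at_by_module.items()
    }


if __name__ == "__main__":
    for at8, at7 in product(SUBSTITUENT_BY_EXTENDER, repeat=2):
        result = predicted_substituents({8: at8, 7: at7})
        print(f"AT8={at8:15} AT7={at7:15} -> {result}")
```

Enumerating all pairs this way lists the mixed cases named in this Example, such as an ethylmalonyl-specific AT in one module combined with a malonyl-specific or methylmalonyl-specific AT in the other.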

Example 6 
Neurotrophic Compounds 

The compounds described in Examples 1 - 4, inclusive, have immunosuppressant
activity and can be employed as immunosuppressants in a manner and in formulations
similar to those employed for FK-506. The compounds of the invention are generally
effective for the prevention of organ rejection in patients receiving organ transplants and in
particular can be used for immunosuppression following orthotopic liver transplantation.
These compounds also have pharmacokinetic properties and metabolism that are more
advantageous for certain applications relative to those of FK-506 or FK-520. These
compounds are also neurotrophic; however, for use as neurotrophins, it is desirable to
modify the compounds to diminish or abolish their immunosuppressant activity. This can be
readily accomplished by hydroxylating the compounds at the C-18 position using
established chemical methodology or novel FK-520 PKS genes provided by the present
invention.

Thus, in one aspect, the present invention provides a method for stimulating nerve
growth that comprises administering a therapeutically effective dose of 18-hydroxy-FK-520.
In another embodiment, the compound administered is a C-18,20-dihydroxy-FK-520
derivative. In another embodiment, the compound administered is a C-13-desmethoxy
and/or C-15-desmethoxy 18-hydroxy-FK-520 derivative. In another embodiment, the
compound administered is a C-13-desmethoxy and/or C-15-desmethoxy
18,20-dihydroxy-FK-520 derivative. In other embodiments, the compounds are the
corresponding analogs of FK-506. The 18-hydroxy compounds of the invention can be
prepared chemically, as described in U.S. Patent No. 5,189,042, incorporated herein by
reference, or by fermentation of a recombinant host cell provided by the present invention
that expresses a recombinant PKS in which the module 5 DH domain has been deleted or
rendered non-functional.

The chemical methodology is as follows. A compound of the invention (~200 mg) is
dissolved in 3 mL of dry methylene chloride and added to 45 µL of 2,6-lutidine, and the
mixture stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl
trifluoromethanesulfonate (64 µL) is added by syringe. After 15 minutes, the reaction
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with brine,
and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo and flash
chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol) gives the
protected compound, which is dissolved in 95% ethanol (2.2 mL) and to which is added 53
µL of pyridine, followed by selenium dioxide (58 mg). The flask is fitted with a water
condenser and heated to 70°C on a mantle. After 20 hours, the mixture is cooled to room
temperature, filtered through diatomaceous earth, and the filtrate poured into a saturated
sodium bicarbonate solution. This is extracted with ethyl acetate, and the organic phase is
washed with brine and dried over magnesium sulfate. The solution is concentrated and
purified by flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1%
methanol) to give the protected 18-hydroxy compound. This compound is dissolved in
acetonitrile and treated with aqueous HF to remove the protecting groups. After dilution
with ethyl acetate, the mixture is washed with saturated bicarbonate and brine, dried over
magnesium sulfate, filtered, and evaporated to yield the 18-hydroxy compound. Thus, the
present invention provides the C-18-hydroxyl derivatives of the compounds described in
Examples 1 - 4.

Those of skill in the art will recognize that other suitable chemical procedures can be
used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et al.,
Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These methods
can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with the R
enantiomer showing a somewhat lower IC50, which may be preferred in some applications.
See Kawai et al., supra. Another preferred protocol is described in Umbreit and Sharpless,
1977, JACS 99(16): 1526-28, although it may be preferable to use 30 equivalents each of
SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents, respectively, described in that
reference.
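
Scaling a reagent to a stated number of equivalents is simple arithmetic, sketched here in Python for illustration. The substrate molar mass used in the example (792.0 g/mol, that of unmodified FK-520) is an assumption, since the actual substrate is a protected derivative with a different molar mass; the SeO2 molar mass of 110.97 g/mol is the standard value, and the function name `reagent_mass_mg` is hypothetical.

```python
def reagent_mass_mg(
    substrate_mg: float,
    substrate_molar_mass: float,
    reagent_molar_mass: float,
    equivalents: float,
) -> float:
    """Mass of reagent (mg) giving `equivalents` relative to the substrate.

    Molar masses are in g/mol; mg divided by g/mol gives mmol.
    """
    substrate_mmol = substrate_mg / substrate_molar_mass
    return substrate_mmol * equivalents * reagent_molar_mass


if __name__ == "__main__":
    # Assumed values: 200 mg of a substrate with the molar mass of FK-520.
    FK520_MOLAR_MASS = 792.0   # g/mol, assumption for illustration
    SEO2_MOLAR_MASS = 110.97   # g/mol
    for eq in (0.02, 30):
        mass = reagent_mass_mg(200.0, FK520_MOLAR_MASS, SEO2_MOLAR_MASS, eq)
        print(f"{eq} equiv SeO2 for 200 mg substrate: {mass:.1f} mg")
```

Under these assumptions, the difference between the catalytic loading of the cited reference and the 30 equivalents suggested above is a factor of 1500 in the mass of SeO2.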

All scientific and patent publications referenced herein are hereby incorporated by
reference. The invention having now been described by way of written description and
example, those of skill in the art will recognize that the invention can be practiced in a
variety of embodiments and that the foregoing description and examples are for purposes of
illustration and not limitation of the following claims.
Claims

1. An isolated nucleic acid that encodes a CoA ligase, a non-ribosomal peptide
synthetase, or a domain of an extender module of a polyketide synthase enzyme that
synthesizes FK-520.

2. The isolated nucleic acid of claim 1 that encodes an extender module, said
module comprising a ketosynthase domain, an acyl transferase domain, and an acyl carrier
protein domain.

3. The isolated nucleic acid of claim 1 that encodes an open reading frame, said
open reading frame comprising coding sequences for two or more extender modules, each
extender module comprising a ketosynthase domain, an acyl transferase domain, and an
acyl carrier protein domain.

4. The isolated nucleic acid of claim 1 that encodes a gene cluster, said gene cluster
comprising two or more open reading frames, each of said open reading frames comprising
coding sequences for two or more extender modules, each of said extender modules
comprising a ketosynthase domain, an acyl transferase domain, and an acyl carrier protein
domain.

5. The isolated nucleic acid of claim 2, wherein at least one of said domains is a
domain of a module of a non-FK-520 polyketide synthase.

6. The isolated nucleic acid of claim 1, wherein said nucleic acid is a recombinant
vector capable of replication in or integration into the chromosome of a host cell.

7. The isolated nucleic acid of claim 6 that is selected from the group consisting of
cosmid pKOS034-120, cosmid pKOS034-124, cosmid pKOS065-M27, and cosmid
pKOS065-M21.

8. The isolated nucleic acid of claim 5, wherein said non-FK-520 polyketide
synthase is rapamycin polyketide synthase, FK-506 polyketide synthase, or erythromycin
polyketide synthase.

9. A method of preparing a polyketide, said method comprising transforming a
host cell with a recombinant DNA vector of claim 6, and culturing said host cell under
conditions such that said polyketide synthase is produced and catalyzes synthesis of said
polyketide.

10. The method of claim 9, wherein said host cell is a Streptomyces host cell.

11. The method of claim 9, wherein said polyketide is selected from the group
consisting of FK-520, 13-desmethoxy-FK-520, and 13-desmethoxy-FK-506.

12. A recombinant host cell that expresses a recombinant polyketide synthase
selected from the group consisting of: (i) an FK-520 polyketide synthase in which at least
one AT domain is replaced by an AT domain of a non-FK-520 polyketide synthase; (ii) an
FK-506 polyketide synthase in which at least one AT domain is replaced by an AT domain
of a non-FK-506 polyketide synthase; (iii) an FK-520 polyketide synthase in which at least
one DH domain has been deleted; and (iv) an FK-506 polyketide synthase in which at least one
DH domain has been deleted.

13. The recombinant host cell of claim 12 that expresses an FK-520 polyketide
synthase in which an AT domain of module 8 has been replaced by an AT domain that
binds malonyl CoA, methylmalonyl CoA, or ethylmalonyl CoA.

14. The recombinant host cell of claim 12 that expresses an FK-506 polyketide
synthase in which an AT domain of module 8 has been replaced by an AT domain that
binds malonyl CoA, methylmalonyl CoA, or ethylmalonyl CoA.

15. The recombinant host cell of claim 13, wherein a DH domain of module 5 or
module 6 has been deleted.

16. The recombinant host cell of claim 14, wherein a DH domain of module 5 or
module 6 has been deleted.

17. A recombinant host cell that comprises recombinant genes coding for enzymes
sufficient for synthesis of ethylmalonyl CoA or 2-hydroxymalonyl CoA.

18. A polyketide having the structure

[chemical structure not reproduced]

wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided that
when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen or
hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and
18-hydroxy-FK-506.

19. The polyketide of claim 18 that is 13-desmethoxy-FK-506.

20. The polyketide of claim 18 that is 13-desmethoxy-18-hydroxy-FK-520.